Re: [Wikidata-l] Wikidata logo proposal
Hey folks :) You're all awesome for creating pretty logos! Love them. Can you all add them to http://meta.wikimedia.org/wiki/Talk:Wikidata#WikiData_logo_candidate please so we don't lose any of them? Cheers Lydia -- Lydia Pintscher - http://about.me/lydia.pintscher Community Communications for Wikidata Wikimedia Deutschland e.V. Eisenacher Straße 2 10777 Berlin www.wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] Wikidata: New ISO Metadata standard MLR published last year
The ISO standard sounds very interesting, but the price is really a problem. If I understand this correctly, the basic quantities in cutting and grinding (part 3) alone cost 66 CHF (http://www.iso.org/iso/iso_catalogue/catalogue_ics/catalogue_detail_ics.htm?ics1=01&ics2=060&ics3=&csnumber=8055), and that would cover only a rather small part of an infobox. Even if Wikidata currently has some money (does it?) to buy documents, this could easily get very expensive. Moreover, Wikidata would need to make these standards readable for everyone in order to work on them, and that would probably be a copyright issue, because it would involve not only reading after purchase but also publishing the documents in a public wiki.
Re: [Wikidata-l] A common, open access, human and computer readable data pool for scientist
On Thu, Apr 5, 2012 at 12:07 PM, emijrp emi...@gmail.com wrote: 2012/4/5 Lydia Pintscher lydia.pintsc...@wikimedia.de However if this is not going to happen in Wikidata itself there is probably demand for a separate instance where this would be possible. What do you mean? What I mean is that if this can't be done in Wikidata then someone could go and set up a separate MediaWiki instance with the extensions we're going to write and do that there instead. Cheers Lydia -- Lydia Pintscher - http://about.me/lydia.pintscher Community Communications for Wikidata Wikimedia Deutschland e.V.
Re: [Wikidata-l] A common, open access, human and computer readable data pool for scientist
Ah, ok. There will be wikidata farms like Wikia-data : ) 2012/4/5 Lydia Pintscher lydia.pintsc...@wikimedia.de On Thu, Apr 5, 2012 at 12:07 PM, emijrp emi...@gmail.com wrote: 2012/4/5 Lydia Pintscher lydia.pintsc...@wikimedia.de However if this is not going to happen in Wikidata itself there is probably demand for a separate instance where this would be possible. What do you mean? What I mean is that if this can't be done in Wikidata then someone could go and set up a separate MediaWiki instance with the extensions we're going to write and do that there instead. Cheers Lydia
Re: [Wikidata-l] A common, open access, human and computer readable data pool for scientist
Lydia wrote: I would like that this project could serve the scientific community in that manner and provide standards for submission of data by scientists. Any plans in this direction? The en.wikipedia page on big data should give the answers, if it were kept current with the current Big Data Research and Development Initiative buzz of the world (http://www.whitehouse.gov/blog/2012/03/29/big-data-big-deal), in France (http://www.bigdataparis.com/fr-index.php), etc. Hi! It is up to the community to later decide what goes into Wikidata and what doesn't. So I can't give you a yes, this will be ok or a no, this will not happen. However, if this is not going to happen in Wikidata itself, there is probably demand for a separate instance where this would be possible. Lydia, are you not afraid that an upside-down (past to possible) rather than a possible-from-present approach will limit us? Once we have finalized Wikidata as a data store, we will have decided on its output/input capacities. jfc
Re: [Wikidata-l] Wikidata: New ISO Metadata standard MLR published last year
Yury Katkov: As far as I know, sometimes the drafts of ISO standards are available for free. Is there a free draft version somewhere? One can get some text information within the ISO database; here is the documentation: http://isotc.iso.org/livelink/livelink?func=ll&objId=8421449&objAction=Open&nexturl=%2Flivelink%2Flivelink%3Ffunc%3Dll%26objId%3D8422698%26objAction%3Dbrowse%26sort%3Dname but you don't get RDF specifications etc. That is, a search at http://www.iso.org/obp/ui/#search for, for example, cutting and grinding (there is so far no direct link to a search request) reveals a lot of unclickable items, just ISO numbers and a little text. But then, the database browsing is still in beta.
Re: [Wikidata-l] Data format for scientific data
Alexander, I think the unit of measurement and uncertainty can be stored in auxiliary Snaks. Sofia 2012/4/5 Alexander Täschner tasc...@uni-muenster.de Hi! I am a particle physicist, so I'm interested in using the Wikidata project in order to keep physical constants, like the mass or lifetime of the neutron, in sync between different articles. In this use case it will be important to have not only the possibility to store and retrieve the value of this constant, together with the reference to the data source, but also the uncertainty. Would it be possible to include a special data type for such constants where the value, the total uncertainty and the unit of measurement can be stored together with the reference (the additional storage of statistical and systematic uncertainty would be nice, but not necessary)? Best regards, Alexander
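To make the "auxiliary Snaks" idea above concrete, here is a minimal sketch of how a physical constant with a unit, a total uncertainty, and a source reference could hang off a main value Snak. The class and field names (PropertyValueSnak, Statement, qualifiers) are illustrative only and the neutron lifetime figure is for illustration; this is not the actual Wikidata data model.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: a statement whose main Snak carries the value,
# with auxiliary Snaks (qualifiers) for the unit and the uncertainty.
# Names and numbers are illustrative, not the real Wikidata model.

@dataclass
class PropertyValueSnak:
    property: str
    value: float

@dataclass
class Statement:
    main_snak: PropertyValueSnak
    qualifiers: dict = field(default_factory=dict)  # auxiliary Snaks
    reference: str = ""

# Mean neutron lifetime in seconds (value and uncertainty for illustration)
neutron_lifetime = Statement(
    main_snak=PropertyValueSnak("lifetime", 879.4),
    qualifiers={"unit": "second", "total uncertainty": 0.6},
    reference="Particle Data Group",
)

# A client article can then render value, uncertainty, and unit together:
q = neutron_lifetime.qualifiers
print(f"{neutron_lifetime.main_snak.value} \u00b1 {q['total uncertainty']} {q['unit']}")
```

Statistical and systematic uncertainties, as Alexander asks for, would simply be two more entries in the qualifiers dictionary under this scheme.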
[Wikidata-l] list for bug emails
Heya :) There is a new mailing list that will get all the bugmail related to Wikidata. You can subscribe at https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs if you want to get them. Cheers Lydia -- Lydia Pintscher - http://about.me/lydia.pintscher Community Communications for Wikidata Wikimedia Deutschland e.V.
Re: [Wikidata-l] Output formats
Semantic MediaWiki handles this correctly. Wikipedia usually saves numeric data without a thousands separator and then uses the {{formatnum:}} magic word. For conversions, see this example: http://semantic-mediawiki.org/wiki/Berlin, where km2 is converted to sq mi using [[Corresponds to::]] http://semantic-mediawiki.org/w/index.php?title=Property:Area&action=edit 2012/4/5 Bináris wikipo...@gmail.com Hi, this is about the output format of numerical, money-type and date-type data. This is also language dependent; for example, in Hungarian the decimal sign is a comma rather than a dot, and the thousands separator is a space, not a comma. (In a computer environment, preferably a non-breaking space.) Will the interface of Wikidata handle these national/local differences? I think this would be much more efficient than letting the recipient projects transform data to their own format. Some Wikipedias use templates to translate miles to kilometers and vice versa. This translation often fails due to a format error and results in funny values. (Example: a train station that is 31 km away from the town. [1]) As Wikidata has controlled data, it would be useful to solve the conversion of measurement units locally and serve data in the desired format. (A mile expressed in kilometers is also a piece of data that can be stored, but may perhaps need stronger protection, as it has an effect on the output of many other data, just as a highly used template is editable by admins only.) [1] http://en.wikipedia.org/w/index.php?title=Mez%C5%91keresztes&diff=next&oldid=159993221 -- Bináris
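The two points above (locale-dependent separators and central unit conversion) can be sketched in a few lines. The separator table and conversion factor below are illustrative; a real implementation would draw on full CLDR locale data rather than a hand-written dictionary.

```python
# Sketch of locale-aware number formatting plus a centrally stored unit
# conversion, as discussed above. The separator table is illustrative;
# real software would use CLDR locale data.

KM2_TO_SQMI = 0.386102  # square kilometers to square miles

# (decimal sign, thousands separator) per language
SEPARATORS = {
    "en": (".", ","),
    "hu": (",", "\u00a0"),  # Hungarian: comma, non-breaking space
}

def format_number(value: float, lang: str, decimals: int = 2) -> str:
    decimal_sign, thousands_sep = SEPARATORS[lang]
    text = f"{value:,.{decimals}f}"  # US-style, e.g. '1,234.50'
    # swap the US-style separators for the locale's own, via a placeholder
    text = (text.replace(",", "\0")
                .replace(".", decimal_sign)
                .replace("\0", thousands_sep))
    return text

area_km2 = 891.85  # Berlin's area, from the SMW example page
print(format_number(area_km2, "hu"))               # '891,85'
print(format_number(area_km2 * KM2_TO_SQMI, "en"))  # converted to sq mi
```

Serving the converted, locale-formatted string from one place would avoid exactly the template-conversion errors (the 31 km train station) that Bináris mentions.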
Re: [Wikidata-l] Data_model: Metamodel: Wikipedialink
The label and the description together are meant to be identifying. I.e. Georgia - A country in central Asia, or Frankfurt - A city in Hesse, Germany, etc. Additionally, the Wikipedia links provide quite some guidance to it. Cheers, Denny 2012/4/5 Gregor Hagedorn g.m.haged...@gmail.com Wikidata can (and probably will) store information about each moon of Uranus, e.g., its mass. It probably does not make sense to store the mass of Moons of Uranus if there is such an article. It does not help to know that the article Moons of Uranus also talks (among other things) about some moon that has a particular mass: you need to know what *exactly* you are talking about to exploit this data. An article on Moons of Uranus could still (eventually) embed Wikidata data to improve its display, but this data must refer to individual moons, not to the article as a whole. The problem I see is that you have no definition to which real object the data are tied. We agree that the problem is not the interwiki links per se. It is what results from it. How do we tie data to a wikidata page when we don't know what it is about? -- Project director Wikidata Wikimedia Deutschland e.V. | Eisenacher Straße 2 | 10777 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
Re: [Wikidata-l] SNAK - assertion?
Dear Martynas, if you try to model the following statement in RDF: The population density of France, as of a 2012 estimate, is 116 per square kilometer, according to the Bilan demographique 2010. you might notice that RDF requires a reification of the statement. The data model that you have seen provides us with an abstract and concise way to talk about these reifications (i.e. via the statement model, just as in RDF). We still have not finished the document describing how to map our data model to OWL/RDF, but we have thought about this the whole time while discussing the data model. But if you find a simpler, more RDFish way to express the above statement, please feel free to enlighten me. I would indeed be very interested. Cheers, Denny 2012/4/5 Martynas Jusevicius marty...@graphity.org it doesn't look like reuse of existing concepts and standards is a priority for this project. One cannot build a Semantic Web application by ignoring its main building block, which is the RDF data model.
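To illustrate the reification point concretely: below is a rough sketch of the France statement as subject-predicate-object triples, first naively and then reified with the standard rdf:Statement vocabulary so that the estimate date and the source can be attached to the statement itself. Plain Python tuples stand in for triples, and all ex: URIs are made up; this is not the (still unfinished) official Wikidata-to-RDF mapping.

```python
# Sketch of RDF reification using plain Python tuples as triples.
# The ex: URIs are invented for illustration only.

# Naive triple: nowhere to hang the estimate date or the source.
naive = ("ex:France", "ex:populationDensity", "116 /km2")

# Reified: the statement itself becomes a resource (ex:stmt1),
# so further triples can describe it.
reified = [
    ("ex:stmt1", "rdf:type", "rdf:Statement"),
    ("ex:stmt1", "rdf:subject", "ex:France"),
    ("ex:stmt1", "rdf:predicate", "ex:populationDensity"),
    ("ex:stmt1", "rdf:object", "116 /km2"),
    # auxiliary information about the statement, not about France:
    ("ex:stmt1", "ex:asOf", "2012 estimate"),
    ("ex:stmt1", "ex:source", "Bilan demographique 2010"),
]

# The reified form still lets us recover the original triple:
recovered = tuple(
    o for p in ("rdf:subject", "rdf:predicate", "rdf:object")
    for s, pred, o in reified if pred == p
)
print(recovered == naive)  # True
```

The four rdf:* triples are the bookkeeping cost of reification that Denny alludes to: one naive triple becomes six, which is why an abstract statement model that hides this machinery is attractive.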
Re: [Wikidata-l] Data_model: Metamodel: Wikipedialink
On 04/04/12 23:23, Gregor Hagedorn wrote: Wikidata can (and probably will) store information about each moon of Uranus, e.g., its mass. It probably does not make sense to store the mass of Moons of Uranus if there is such an article. It does not help to know that the article Moons of Uranus also talks (among other things) about some moon that has a particular mass: you need to know what *exactly* you are talking about to exploit this data. An article on Moons of Uranus could still (eventually) embed Wikidata data to improve its display, but this data must refer to individual moons, not to the article as a whole. The problem I see is that you have no definition to which real object the data are tied. We agree that the problem is not the interwiki links per se. It is what results from it. How do we tie data to a wikidata page when we don't know what it is about? This is a hard question. The best answer I can come up with now (on the bus to Oxford) is as follows: the meaning of Wikidata items is subject to social agreement, based on shared experience, communication, and human-language documentation. The latter is provided in labels and descriptions, in Wikipedia articles that are connected to a Wikidata item, and also in Wikidata property pages that document properties. I know that this may not be a satisfactory answer to your question of how we can *really* *know* what a Wikidata item is about. If you want to dig deeper into this issue, there is a lot of interesting literature, which can give you many more details than I can. What we are dealing with is the well-known philosophical problem of /grounding/. In essence, the state of discussion boils down to the following: there is no known way of connecting the symbols of a purely symbolic system (such as a computer program) to real-world objects in a formal way. Going deeper into the discussion reveals that there is also no agreed-upon way to clarify the meaning of real and object in the first place.
In spite of all this, humans somehow manage to understand each other, which brings us to the point of how amazing they all are :-) Wikidata is but a humble technical tool that provides an environment for articulating and (I hope) improving this understanding in a novel way. This cannot provide a formal grounding, but it might come as close to this ideal as we have gotten yet. Regards, Markus -- Dr. Markus Kroetzsch Department of Computer Science, University of Oxford Room 306, Parks Road, OX1 3QD Oxford, United Kingdom +44 (0)1865 283529 http://korrekt.org/
Re: [Wikidata-l] Type namespace
Hi Denny - Thanks for your reply and I am relieved. The design seems in the process of walking towards looking quite a lot like ISO Topic Maps, I must say, because it designates no wall of separation between classes and topics. Today that wall exists in SMW in the dichotomy of Category vs all-other-namespaces, with the problems I've outlined. I'm reading into your document that there will be no wall - that the topics describing classes surely will exist in the same 'namespace' as the topics purported to be instances of these classes. Is this correct? If so, then there's less difference between ISO Topic Maps and your design than what I had originally thought. If indeed the direction of the project (as I detect on this email list) is to associate pages with classification schemes such as LCSH or many others, then we're talking about even more of an ISO Topic Map orientation. Which brings me back to the many benefits of a *brutally honest* adoption of the ISO Topic Map technology. Extend and refine it for sure, but imho ISO Topic Map technology is an excellent fit with wiki implementations. It seems to be what you're incidentally doing anyway. cheers - john
Re: [Wikidata-l] Is a mailing-list efficient to discuss the Wikidata project?
You guys experiencing problems with reading mailing lists probably want to start using Gmail, as it nicely aggregates mails into threads so you can read it forum-style... 2012/4/4 Platonides platoni...@gmail.com: On 04/04/12 06:33, aniketkarmar...@aniketkarmarkar.com wrote: Hi Everyone, I must agree that these emails are getting a bit overwhelming. I am not even finding time to read more than 1 or 2 of them. I think a forum would be very helpful to keep the ideas organized. Aniket Why would a forum be easier for you? I recommend grouping the mailing list into threads (which is supposedly possible with your MUA [1]). That way it's much easier to follow (or discard) the conversations, and it probably gives you the benefits that you expect from a forum. 1-http://www.horde.org/apps/imp/ PS: Kudos to Bináris for his great message on good email clients.
Re: [Wikidata-l] Is a mailing-list efficient to discuss the Wikidata project?
I have to admit that it is easy to miss the overview. What about a Facebook group? On Thu, Apr 5, 2012 at 11:27 PM, Jan Kučera kozuc...@gmail.com wrote: You guys experiencing problems with reading mailing lists probably want to start using Gmail, as it nicely aggregates mails into threads so you can read it forum-style... 2012/4/4 Platonides platoni...@gmail.com: On 04/04/12 06:33, aniketkarmar...@aniketkarmarkar.com wrote: Hi Everyone, I must agree that these emails are getting a bit overwhelming. I am not even finding time to read more than 1 or 2 of them. I think a forum would be very helpful to keep the ideas organized. Aniket Why would a forum be easier for you? I recommend grouping the mailing list into threads (which is supposedly possible with your MUA [1]). That way it's much easier to follow (or discard) the conversations, and it probably gives you the benefits that you expect from a forum. 1-http://www.horde.org/apps/imp/ PS: Kudos to Bináris for his great message on good email clients.
Re: [Wikidata-l] Is a mailing-list efficient to discuss the Wikidata project?
2012/4/6 Leukippos Institute leukipposinstit...@googlemail.com I have to admit that it is easy to miss the overview. What about a Facebook group? You can't organize the most important Internet project in a while using that thing called Facebook.
Re: [Wikidata-l] Is a mailing-list efficient to discuss the Wikidata project?
I am not fixed on fb. I was just thinking about a place where we can have a structured look at the posts and the responses. I miss the overview with all these emails. Any suggestions? On Fri, Apr 6, 2012 at 12:13 AM, emijrp emi...@gmail.com wrote: 2012/4/6 Leukippos Institute leukipposinstit...@googlemail.com I have to admit that it is easy to miss the overview. What about a Facebook group? You can't organize the most important Internet project in a while using that thing called Facebook.
Re: [Wikidata-l] Type namespace
Hi John, no, you have seen correctly that there is no separation between classes and instances. If this brings our model closer to topic maps, then this is convenient. I have to admit that my knowledge of topic maps is quite limited. As far as I understand it, they are an ISO standard and can be bought here, e.g. the data model: http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=40017 We will not export our data in a format whose description is not available for free. But this is no problem anyway: if I understand topic maps correctly, it should be trivial to write a transformer that takes the export that Wikidata will offer and translates it into topic maps, if you are so inclined. This way the topic maps community can be served through that transformer easily, be it a web service or a parser front-end. Cheers, Denny 2012/4/5 John McClure jmccl...@hypergrove.com Hi Denny - Thanks for your reply and I am relieved. The design seems in the process of walking towards looking quite a lot like ISO Topic Maps, I must say, because it designates no wall of separation between classes and topics. Today that wall exists in SMW in the dichotomy of Category vs all-other-namespaces, with the problems I've outlined. I'm reading into your document that there will be no wall - that the topics describing classes surely will exist in the same 'namespace' as the topics purported to be instances of these classes. Is this correct? If so, then there's less difference between ISO Topic Maps and your design than what I had originally thought. If indeed the direction of the project (as I detect on this email list) is to associate pages with classification schemes such as LCSH or many others, then we're talking about even more of an ISO Topic Map orientation. Which brings me back to the many benefits of a *brutally honest* adoption of the ISO Topic Map technology.
Extend and refine it for sure, but imho ISO Topic Map technology is an excellent fit with wiki implementations. It seems to be what you're incidentally doing anyway. cheers - john -- Project director Wikidata Wikimedia Deutschland e.V. | Eisenacher Straße 2 | 10777 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de
Re: [Wikidata-l] SNAK - assertion?
Denny said: But if you find a simpler, and more RDFish way to express the (below) statement, please feel free to enlighten me. I would be indeed very interested. The population density of France, as of a 2012 estimate, is 116 per square kilometer, according to the Bilan demographique 2010. A wiki namespace-based approach is:

France has_subobject France#Density:2012_pop_estimate_Bilan_2010
France#Density:2012_pop_estimate_Bilan_2010 ^source Bilan ...2010
France#Density:2012_pop_estimate_Bilan_2010 ^npkm2 116
France#Density:2012_pop_estimate_Bilan_2010 Type Estimated
France#Density:2012_pop_estimate_Bilan_2010 ^date 2012

Key:
- France#Density:2012_pop_estimate_Bilan_2010 is a named subobject
- This subobject is so named to prevent subobject name collisions
- Subobject names follow pagename naming conventions
- has_subobject is a reserved property name
- Density is an instance in a Type (or, Noun) namespace
- All properties prefixed by ^ are text properties
- ^source and ^date are Dublin Core properties (both text properties)
- Type is a Dublin Core property (an object property)
- Estimated is an adjective treated as a subclass of owl:Class
- Estimated is an abbreviation of Estimated Things
- ^npkm2 is an SI unit (amount per sq km)
Re: [Wikidata-l] Namespace-based model
John, your suggestion has two requirements that I think are hard to achieve: * first, we need an agreement on the set of (non-overlapping but complete) types that exist in the world * second, we would need to assume that the Wikidata editors would agree on one and exactly one type for every item, and not change that anymore. And this seems to have to happen during the creation of the item. I find both assumptions rather strong. What type would Tuesday be? Or the Roman Catholic religion? What type is Love? Who decides on the types? What are the conditions for typeness? You give Person and Place and Date as types. So Obama is a Person. Is Gollum a Person? HAL 9000? Noah, the builder of the ark? Enos, the chimpanzee that traveled into space? I think the assumption that everything has exactly one type is oversimplifying. Cheers, Denny 2012/4/5 John McClure jmccl...@hypergrove.com Wiki namespaces are currently so underused that people may not realize their importance: they provide crucial semantic information. For instance, consider the example given in Wikidata's data model article [1]: Obama was US Senator from Illinois from January 3, 2005 to November 16, 2008, which yielded these observations: - mainSnak of type PropertyValueSnak with subject Obama, property US Senator from, and value Illinois - auxiliary Snak of type PropertyIntervalSnak with property in office and interval January 3, 2005 to November 16, 2008 (the subject of the auxiliary Snak is always the statement itself). An alternative lexical model might restate this as The US Senator for the place Illinois is/was the person Obama from date January 3, 2005 until date November 16, 2008.
- the prime resource being described is a US Senator page, not so much the Obama page - Person:Obama is the subject complement of this US Senator via the linking verb-property 'was' or 'is' - for is a property of this US Senator whose value is Place:Illinois - from is a property of this US Senator with the value Date:January 3, 2005 - until is a property of this US Senator with the value Date:November 16, 2008 A significant point is that the US Senator page is named Senator:Barack H Obama (or Legislator:Barack H Obama or Public Employee:Barack H Obama, etc.); it is of type US Senator, and it has these three properties: for, from, and until. In other words, if the content from this page is to be shown on the Person:Barack H Obama page, then that content should be transcluded from the Senator page; its semantic markup need not be, because software can interpret transcluded material as being a subject of, or organic to, the Person page. Lastly, I really don't know how developers will cognitively absorb made-up words like Snak. The need for the term does mystify me somewhat. I do think everyone seems to get namespaces, appreciating the clarity they provide. I hope concepts like namespace can be equally as prominent at this stage as Snaks in the Wikidata model. Regards, --Hypergrove (talk) 03:03, 5 April 2012 (UTC) [1] http://meta.wikimedia.org/w/index.php?title=Talk:Wikidata/Data_model -- Project director Wikidata Wikimedia Deutschland e.V. | Eisenacher Straße 2 | 10777 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de
Re: [Wikidata-l] Data_model: Metamodel: Wikipedialink
Hi All, Our data (using a 25-language dataset) agrees with Denny's: 99% of all connected components of the interlanguage link graph have only one article per language edition. This is something we looked into in some detail in our paper at ACM's CHI conference this year (http://www.brenthecht.com/papers/bhecht_CHI2012_omnipedia.pdf). However, it is important to point out that the 1% tends to contain articles that are of great general interest. Some English articles that occur in these situations include author, art, indigenous people, education, privacy, liberal arts, computer science, agriculture, socialism, army, etc. To a certain extent, this is to be expected: where there is more global interest in a topic, there is going to be more ambiguity. Just my two cents. - Brent Brent Hecht Ph.D. Candidate in Computer Science CollabLab: The Collaborative Technology Laboratory Northwestern University w: http://www.brenthecht.com e: br...@u.northwestern.edu On Apr 5, 2012, at 4:50 PM, Denny Vrandečić wrote: Regarding definitions: Note that I said Label + Description is identifying, not merely the label. I assume this to be true because even for your example of Germany, the disambiguation page works with rather short descriptions of each disambiguated page [1]. So even that fuzzy concept that you gave as an example seems to be sufficiently identifiable for the sake and mission of the Wikipedia community, which gives me reason to believe that the community can sort this out. I mean, they basically already had! Regarding the Kangoo / Kubistar example: In Wikidata they would be represented as two pages, one for the Kubistar (which would link to the Danish and German pages for the Kubistar), and one for the Kangoo (which would link to the 20 language versions of the Kangoo article, including a Danish and a German one). This is a rather simple example, which would be easily expressed with the exact matches that we suggest.
In Wikidata, the Wikipedia links are planned to be inverse functional - i.e., every Wikipedia article in a specific language can only be linked to from one single Wikidata article. Two Wikidata pages cannot claim the same Wikipedia article in a single language as their defining article. I.e. in the Kubistar/Kangoo example there would be two Wikidata pages, one about the Kubistar, linking to de:Nissan_Kubistar and da:Nissan_Kubistar, and one about the Kangoo, linking to the 20 different Kangoo articles. The Wikidata page for the Kubistar could not link to any of those Kangoo articles. Please do not misunderstand, I am not categorically against non-exact matches or broader or narrower (or else I wouldn't be discussing). But I haven't seen examples yet that convince me that the additional complexity of broader/narrower or inexact matches is required. As I said before, if we can model more than 99% of all language links with the suggested simple solution, I am reluctant to make it more complicated for the remaining 1%. Cheers, Denny P.S.: oh, yes, indeed! Thank you for this excellent and interesting discussion, it really does shed light on some of the aspects of the current draft of the data model, and will eventually improve it and sharpen the understanding of the model. [1] https://en.wikipedia.org/wiki/Germany_(disambiguation) 2012/4/5 Gregor Hagedorn g.m.haged...@gmail.com On 5 April 2012 18:30, Denny Vrandečić denny.vrande...@wikimedia.de wrote: The label and the description together are meant to be identifying. I.e. Georgia - A country in central Asia, or Frankfurt - A city in Hesse, Germany, etc. Additionally, the Wikipedia links provide quite some guidance to it. I believe it will be difficult to craft labels that work as definitions. A label is hinting, and may often be sufficiently precise for the majority of purposes.
If we speak of Germany it is very hard to express in a simple string the different historical, geographical, political delimitations that this term may carry. In my own field of work even technical terms are often difficult to resolve to a definition. In biology, the width of taxon delimitations changes over time and with new research, and even technical terms in morphology often have quite different meanings, depending on the school that is being followed. Or to cite a car example again: The label Renault Kangoo is unspecific as to the version/revision/release of it, so technical data that vary between these versions cannot be added to it. However, the de.wikipedia.org/wiki/Nissan_Kubistar is in most Wikipedias also subsumed under Renault Kangoo. So it is a valid assumption that when labeling something Renault Kangoo it refers to both of these identical models sold under different names. But then, the Nissan Kubistar is only equivalent to the first version/revision/release of the Renault Kangoo... This is not unsolvable, but if you want to import
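The inverse-functional sitelink constraint discussed in this thread - a Wikipedia article in a given language may be claimed by at most one Wikidata page - can be sketched as a simple check. This is an invented illustration, not Wikidata code; the page IDs are hypothetical.

```python
# Hypothetical illustration of the inverse-functional sitelink constraint:
# each (language, article) pair may belong to at most one Wikidata page.

def find_sitelink_conflicts(pages):
    """Return (language, article) pairs claimed by more than one page."""
    seen = {}        # (language, article) -> first page id that claimed it
    conflicts = []
    for page_id, links in pages.items():
        for lang, article in links:
            if (lang, article) in seen and seen[(lang, article)] != page_id:
                conflicts.append((lang, article))
            else:
                seen[(lang, article)] = page_id
    return conflicts

# Kubistar and Kangoo as two separate Wikidata pages (IDs are invented):
pages = {
    "Q_kubistar": [("de", "Nissan_Kubistar"), ("da", "Nissan_Kubistar")],
    "Q_kangoo":   [("de", "Renault_Kangoo"), ("da", "Renault_Kangoo"),
                   ("en", "Renault_Kangoo")],
}
print(find_sitelink_conflicts(pages))   # no conflicts: []

# If the Kubistar page also claimed de:Renault_Kangoo, the constraint
# would be violated:
pages["Q_kubistar"].append(("de", "Renault_Kangoo"))
print(find_sitelink_conflicts(pages))   # [('de', 'Renault_Kangoo')]
```

Under this check, the Kubistar/Kangoo split into two Wikidata pages is conflict-free exactly as Denny describes, and merging any Kangoo sitelink into the Kubistar page is flagged.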
Re: [Wikidata-l] Is a mailing-list efficient to discuss the Wikidata project?
On Thu, Apr 5, 2012 at 18:27, Jan Kučera kozuc...@gmail.com wrote: You guys experiencing problems with reading mailing lists probably want to start using Gmail, as it nicely aggregates mails into threads so you can read them forum-style... +1 ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] Data_model: Metamodel: Wikipedialink
Thanks, Brent! I was hoping to get some numbers exactly from you :) I am extremely curious what kind of statements people will make in the Wikidata page about art, privacy, agriculture, army, etc. I am looking forward to see what the community will add there. That'll be fun to watch :) (Usually, such things tend to be retroactively obvious, but extremely hard to predict :) ) Cheers, Denny
Re: [Wikidata-l] SNAK - assertion?
John, thanks! I fully agree. And this is indeed pretty much what we have in our data model.* I think that we really need to get our draft mapping to RDF done, in order to show that we align pretty much with this suggestion. Cheers, Denny

* well, we also add density, but I think that is merely an oversight in your suggestion, and you forgot to add something like France#Density:2012_pop_estimate_Bilan_2010 property Density .

2012/4/6 John McClure jmccl...@hypergrove.com

Denny said: But if you find a simpler, and more RDFish way to express the (below) statement, please feel free to enlighten me. I would be indeed very interested. The population density of France, as of a 2012 estimate, is 116 per square kilometer, according to the Bilan demographique 2010.

A wiki namespace-based approach is:

France has_subobject France#Density:2012_pop_estimate_Bilan_2010
France#Density:2012_pop_estimate_Bilan_2010 ^source Bilan ...2010
France#Density:2012_pop_estimate_Bilan_2010 ^npkm2 116
France#Density:2012_pop_estimate_Bilan_2010 Type Estimated
France#Density:2012_pop_estimate_Bilan_2010 ^date 2012

Key:
France#Density:2012_pop_estimate_Bilan_2010 is a named subobject
This subobject is so named to prevent subobject name collisions
Subobject names follow pagename naming conventions
has_subobject is a reserved property name
Density is an instance in a Type (or, Noun) namespace
All properties prefixed by ^ are text properties
^source and ^date are Dublin Core properties (both text properties)
Type is a Dublin Core property (an object property)
Estimated is an adjective treated as a subclass of owl:Class
Estimated is an abbreviation of Estimated Things
^npkm2 is an SI unit (amount per sq km)

___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l -- Project director Wikidata Wikimedia Deutschland e.V. | Eisenacher Straße 2 | 10777 Berlin Tel. 
+49-30-219 158 26-0 | http://wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
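John's namespace-based subobject approach above can be sketched with plain (subject, property, value) triples. The subobject name and property names (^source, ^npkm2, ^date, Type) come from his email; the Python data structure and helper function are an invented illustration, not any actual wiki implementation.

```python
# A minimal sketch of the namespace-based subobject proposal from the
# thread, using plain Python tuples as (subject, property, value) triples.
# The structure and helper are illustrative assumptions.

SUBOBJ = "France#Density:2012_pop_estimate_Bilan_2010"

triples = [
    ("France", "has_subobject", SUBOBJ),
    (SUBOBJ, "^source", "Bilan demographique 2010"),
    (SUBOBJ, "^npkm2", 116),
    (SUBOBJ, "Type", "Estimated"),
    (SUBOBJ, "^date", 2012),
]

def values_of(subject, prop):
    """All values of a property on a subject."""
    return [v for s, p, v in triples if s == subject and p == prop]

# The subobject name itself encodes the 'Density' namespace, so the
# density claim is reachable by following has_subobject from France:
for sub in values_of("France", "has_subobject"):
    print(sub, "->", values_of(sub, "^npkm2"))
```

Note how Denny's point plays out here: nothing except the subobject's own name says the claim is about density, which is exactly why he suggests an explicit property Density triple in addition to the name.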
Re: [Wikidata-l] SNAK - assertion?
Denny said: you forgot to add something like France#Density:2012_pop_estimate_Bilan_2010 property Density . No I did not forget anything, given the Density 'namespace' in the subobject name. IOW your triple merely restates what is discernible from the subobject name. Maybe you should tell me what a property property is supposed to represent. At most I made a misstatement that Estimated is an adjective treated as a subclass of owl:Class. It should say as an instance of owl:Class ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] Namespace-based model
It's more accurate to say that your belief is an artifact of present tools. RDF has just one way to associate a Class with an object, the rdf:type attribute. Specifically because RDF makes no distinction between classes that represent a type-of-thing (e.g. a Character) and classes that represent a facet-of-thing (e.g. Fictional), present tools require multiple classes to be able to be associated with any resource. Obviously a given resource can have multiple facets. In my work I store facet-classes in the Dublin Core Coverage and Format properties and I store a single existential-class in the Dublin Core Type property for the page; the page's template restates both kinds of classes as Categories for the page (hence my piqued email to at least define existential classes in a separate namespace from category). So if no distinction is made, then multiple types are indeed necessary. If a distinction between nouns and adjectives is made, then one type + multiple facets is necessary. -Original Message- From: John McClure [mailto:jmccl...@hypergrove.com] Sent: Thursday, April 05, 2012 7:08 PM To: Wikidata (E-mail) Subject: [Wikidata-l] Namespace-based model Denny said: I think the assumption everything has exactly one type is oversimplifying The assumption that everything is of multiple types is over-complicating. Usually you can tell from the first sentence in the Wikipedia page.

Tuesday is a day of the week
Love is an emotion
(Roman) Catholicism is a faith
Gollum is a fictional character
HAL-9000 is a character
Noah is a Patriarch
Enos was the first chimpanzee

So consensus certainly is being achieved among thousands of authors about the fundamental type of thing each of these pages represents. Disambiguation pages very commonly reference these types of things as in Enos (chimpanzee). Let's take Gollum. I can imagine a topic map has these subjects:

1. Character
1A. Fictional character
1A1. Fictional person
1A2. Fictional animal
1A3. Fictional ghost
1A4. Fictional god

Another equally valid assertion is that Gollum is a Character that is typed as Fictional and Human (both adjectives being instances of owl:Class) -- so that a comprehensive system sometime in the future would reinterpret that Gollum is actually a Fictional person. As you say yourself, it's not useful to create a perfect system to handle every imaginable edge case **to the extent that they exist**. Personally I don't believe such edge cases can be found - I challenge anyone to provide me such an example. But more to the point of Wikidata: I don't believe for a second that WP will be reorganized into thousands of namespaces. Rather, I believe first, SUBOBJECT names must include the idea of 'namespace' for the efficiencies gained, and second, WP pages should be associated with the same set of nouns (noun-phrases) available for subobject names. IOW, it's an implementation issue whether a wiki's pages are named using these namespaces, so that the wiki as a whole can gain the same inherent efficiencies I've sketched for subobjects. Best - john ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
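The "one existential type plus multiple facet classes" scheme argued for above can be sketched as a toy data model, loosely mirroring the Dublin Core slots John mentions (Type for the single class, Coverage/Format for facets). All names and values here are invented for illustration; this is not any real ontology tooling.

```python
# Toy sketch of 'one existential type + multiple facets' per resource.
# The type/facet split is the scheme from the email; values are invented.

resources = {
    "Gollum":   {"type": "Character", "facets": {"Fictional", "Human"}},
    "HAL-9000": {"type": "Character", "facets": {"Fictional"}},
    "Tuesday":  {"type": "Day_of_week", "facets": set()},
}

def categories(name):
    """A page's categories: its single existential type plus all of its
    facets - the template restatement described in the email."""
    r = resources[name]
    return sorted({r["type"]} | r["facets"])

print(categories("Gollum"))   # ['Character', 'Fictional', 'Human']
```

A later system could then recombine type and facets, e.g. reading Character + Fictional as the narrower class Fictional character, which is the reinterpretation step John describes for Gollum.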