Re: [Wikidata-l] Wikidata logo proposal

2012-04-05 Thread Lydia Pintscher
Hey folks :)

You're all awesome for creating pretty logos! Love them.
Can you all add them to
http://meta.wikimedia.org/wiki/Talk:Wikidata#WikiData_logo_candidate
please so we don't lose any of them?


Cheers
Lydia

-- 
Lydia Pintscher - http://about.me/lydia.pintscher
Community Communications for Wikidata

Wikimedia Deutschland e.V.
Eisenacher Straße 2
10777 Berlin
www.wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Wikidata: New ISO Metadata standard MLR published last year

2012-04-05 Thread Nadja Kutz
The ISO standard sounds very interesting, but the price is really a problem.
If I understand this correctly, "Basic quantities in cutting and grinding,
Part 3" alone costs 66 CHF
http://www.iso.org/iso/iso_catalogue/catalogue_ics/catalogue_detail_ics.htm?ics1=01&ics2=060&ics3=&csnumber=8055
and it would probably cover only a rather small part of an infobox.
So even if Wikidata currently has some money (does it?) to buy documents,
this can easily get very expensive.
Moreover, Wikidata would need to make these standards readable for everyone in
order to work with them, and this would probably be a copyright issue, because
it would involve not only reading after purchase but also publishing the
documents in a public wiki.







Re: [Wikidata-l] A common, open access, human and computer readable data pool for scientist

2012-04-05 Thread Lydia Pintscher
On Thu, Apr 5, 2012 at 12:07 PM, emijrp emi...@gmail.com wrote:
 2012/4/5 Lydia Pintscher lydia.pintsc...@wikimedia.de

 However if this is not going to happen in Wikidata
 itself there is probably demand for a separate instance where this
 would be possible.


 What do you mean?

What I mean is that if this can't be done in Wikidata then someone
could go and set up a separate MediaWiki instance with the extensions
we're going to write and do that there instead.


Cheers
Lydia



Re: [Wikidata-l] A common, open access, human and computer readable data pool for scientist

2012-04-05 Thread emijrp
Ah, ok. There will be wikidatafarms like Wikia-data : )



Re: [Wikidata-l] A common, open access, human and computer readable data pool for scientist

2012-04-05 Thread JFC Morfin

Lydia wrote:

I would like that this project could serve in that manner
the scientific community and provide standards for submission of data
for scientist. Any plans in this direction?


The en.wikipedia page on big data should give the answers, if it
were kept up to date with the current Big Data Research and Development
Initiative buzz worldwide
(http://www.whitehouse.gov/blog/2012/03/29/big-data-big-deal), in
France (http://www.bigdataparis.com/fr-index.php), etc.




Hi! It is up to the community to later decide what goes into Wikidata
and what doesn't.  So I can't give you a yes this will be ok or no
this will not happen.  However if this is not going to happen in Wikidata
itself there is probably demand for a separate instance where this would
be possible.


Lydia,

are you not afraid that an upside-down (past-to-possible) approach,
rather than a possible-from-present approach, is going to limit us? Once we
have finalized Wikidata as a data store, we will have decided on its
output/input capacities.


jfc 





Re: [Wikidata-l] Wikidata: New ISO Metadata standard MLR published last year

2012-04-05 Thread Nadja Kutz

Yury Katkov: As far as I know, sometimes the drafts of ISO standards are
available for free. Is there a free draft version somewhere?

One can get some textual information from the ISO database; here is the
documentation:
http://isotc.iso.org/livelink/livelink?func=ll&objId=8421449&objAction=Open&nexturl=%2Flivelink%2Flivelink%3Ffunc%3Dll%26objId%3D8422698%26objAction%3Dbrowse%26sort%3Dname

but you don't get RDF specifications etc.

That is, a search at:
http://www.iso.org/obp/ui/#search

for, say, "cutting and grinding"

(there is so far no direct link to a search request)

returns a lot of unclickable items, just ISO numbers and a little text. But
then, the database browsing is still in beta.


Re: [Wikidata-l] Data format for scientific data

2012-04-05 Thread Sofia Khatoon
Alexander,
I think the unit of measurement and uncertainty can be stored in auxiliary
Snaks.
Sofia

2012/4/5 Alexander Täschner tasc...@uni-muenster.de

 Hi!

 I am a particle physicist, so I'm interested in using the Wikidata
 project in order to keep physical constants, like the mass or lifetime of
 the neutron, in sync between different articles. In this use case it will
 be important to have not only the possibility to store and retrieve the
 value of this constant, together with the reference to the data source, but
 also the uncertainty. Would it be possible to include a special data type
 for such constants where the value, the total uncertainty and the unit of
 measurement can be stored together with the reference (the additional
 storage of statistical and systematic uncertainty would be nice, but not
 necessary).

 Best regards,
 Alexander
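
The request above (value, total uncertainty, unit and reference kept
together, with optional statistical and systematic parts) can be sketched
as a small record type. This is only an illustration, not the Wikidata
data model: the class name, the field names and the numbers are all
assumptions made up for the example.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Measurement:
    """Hypothetical value type: number, total uncertainty, unit, source."""
    value: float
    uncertainty: float                      # total uncertainty
    unit: str
    reference: str
    stat_uncertainty: Optional[float] = None   # optional statistical part
    syst_uncertainty: Optional[float] = None   # optional systematic part

    def display(self) -> str:
        return f"{self.value} ± {self.uncertainty} {self.unit}"

    def agrees_with(self, other: "Measurement", n_sigma: float = 1.0) -> bool:
        """True if both values differ by at most n_sigma combined uncertainties."""
        if self.unit != other.unit:
            raise ValueError("units differ; convert before comparing")
        combined = math.hypot(self.uncertainty, other.uncertainty)
        return abs(self.value - other.value) <= n_sigma * combined


# Placeholder numbers, not real physical constants.
a = Measurement(100.0, 2.0, "s", "Source A")
b = Measurement(103.0, 2.0, "s", "Source B")
print(a.display())
print(a.agrees_with(b), a.agrees_with(b, n_sigma=2.0))
```

Keeping the unit and the reference inside the record means an article
that reuses the value cannot silently drop them.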



[Wikidata-l] list for bug emails

2012-04-05 Thread Lydia Pintscher
Heya :)

There is a new mailing list that will get all the bugmail related to
Wikidata. You can subscribe at
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs if you want
to get them.


Cheers
Lydia



Re: [Wikidata-l] Output formats

2012-04-05 Thread emijrp
Semantic MediaWiki handles this correctly.

Wikipedia usually stores numeric data without a thousands separator and
formats it with the {{formatnum:}} magic word.

For conversions, see the example at http://semantic-mediawiki.org/wiki/Berlin,
where the km2 value is converted to sq mi using [[Corresponds to::]]:
http://semantic-mediawiki.org/w/index.php?title=Property:Area&action=edit

2012/4/5 Bináris wikipo...@gmail.com

 Hi,

 this is about the output format of numerical, money-type and date-type
 data. This is also language dependent; for example, in Hungarian the decimal
 sign is a comma rather than a dot and the thousands separator is a space, not
 a comma. (In a computer environment, preferably a non-breaking space.) Will
 the interface of Wikidata handle these national/local differences? I think
 this would be much more efficient than letting the recipient projects
 transform data to their own format.
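
A minimal sketch of the formatting difference described above, assuming
the separators are passed in explicitly. The function name and the small
conventions table are hypothetical; a real interface would more likely
draw on a full locale database.

```python
def format_number(value: float, decimals: int = 2,
                  decimal_sep: str = ".", group_sep: str = ",") -> str:
    """Format a number with the given decimal and thousands separators."""
    text = f"{value:,.{decimals}f}"          # e.g. "1,234,567.89"
    # Swap separators via a placeholder so the two replacements don't collide.
    return (text.replace(",", "\x00")
                .replace(".", decimal_sep)
                .replace("\x00", group_sep))


# Assumed conventions: (decimal sign, thousands separator).
CONVENTIONS = {
    "en": (".", ","),
    "hu": (",", "\u00a0"),   # comma and non-breaking space
}

for lang, (dec, grp) in CONVENTIONS.items():
    print(lang, format_number(1234567.891, 2, dec, grp))
```

The stored value stays a plain number; only the rendering depends on the
language of the recipient project.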

 Some Wikipedias use templates to convert miles to kilometers and vice
 versa. This conversion often fails due to a format error and results in
 funny values. (Example: a train station that is 31 km away from the town.
 [1]) As Wikidata has controlled data, it would be useful to handle the
 conversion of measurement units there and serve data in the desired format.
 (A mile expressed in kilometers is also a piece of data that can be stored,
 but it may need stronger protection, as it affects the output of many other
 data, much as a highly used template is editable by admins only.)
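
The idea that a conversion factor is itself a stored piece of data can be
sketched like this. The table layout and the function name are
assumptions for illustration; the factor 1609.344 is the length of the
international mile in metres.

```python
# Conversion factors kept as data, each unit expressed in metres.
FACTORS_TO_METRE = {
    "m": 1.0,
    "km": 1000.0,
    "mi": 1609.344,
}


def convert(value: float, from_unit: str, to_unit: str) -> float:
    """Convert a length between two units listed in FACTORS_TO_METRE."""
    for unit in (from_unit, to_unit):
        if unit not in FACTORS_TO_METRE:
            raise KeyError(f"unknown unit: {unit}")
    return value * FACTORS_TO_METRE[from_unit] / FACTORS_TO_METRE[to_unit]


print(round(convert(1, "mi", "km"), 6))
print(round(convert(31, "km", "mi"), 2))
```

Because every conversion goes through the one table, a wrong or
vandalised factor would affect all outputs at once, which is the reason
given above for protecting such values more strongly.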

 [1]
 http://en.wikipedia.org/w/index.php?title=Mez%C5%91keresztes&diff=next&oldid=159993221

 --
 Bináris



Re: [Wikidata-l] Data_model: Metamodel: Wikipedialink

2012-04-05 Thread Denny Vrandečić
The label and the description together are meant to be identifying.

I.e. Georgia - A country in central Asia, or Frankfurt - A city in
Hesse, Germany, etc.

Additionally, the Wikipedia links provide quite some guidance to it.

Cheers,
Denny


2012/4/5 Gregor Hagedorn g.m.haged...@gmail.com

  Wikidata can (and probably will) store information about each moon of
  Uranus, e.g., its mass. It probably does not make sense to store the mass of
  Moons of Uranus if there is such an article. It does not help to know that
  the article Moons of Uranus also talks (among other things) about some
  moon that has a particular mass: you need to know what *exactly* you are
  talking about to exploit this data. An article on Moons of Uranus could
  still (eventually) embed Wikidata data to improve its display, but this data
  must refer to individual moons, not to the article as a whole.

 The problem I see is that you have no definition to which real object
 the data are tied. We agree that the problem is not the interwiki
 links per se. It is what results from it. How do we tie data to a
 wikidata page when we don't know what it is about?





-- 
Project director Wikidata
Wikimedia Deutschland e.V. | Eisenacher Straße 2 | 10777 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter
der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für
Körperschaften I Berlin, Steuernummer 27/681/51985.


Re: [Wikidata-l] SNAK - assertion?

2012-04-05 Thread Denny Vrandečić
Dear Martynas,

if you try to model the following statement in RDF

The population density of France, as of a 2012 estimate, is 116 per
square kilometer, according to the Bilan demographique 2010.

you might notice that RDF requires a reification of the statement. The data
model that you have seen provides us with an abstract and concise way to
talk about these reifications (i.e. via the statement model, just as in
RDF).
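
To make the reification point concrete, here is a rough sketch that
spells the example statement out as plain triples using the standard RDF
reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate,
rdf:object). It is not the planned Wikidata mapping, which is not
finished; the example.org namespace and the property names asOf and
source are assumptions.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/"   # hypothetical namespace for the example


def reify(statement_id, subject, predicate, obj, annotations):
    """Return triples describing a statement plus annotations about it."""
    node = EX + statement_id
    triples = [
        (node, RDF + "type", RDF + "Statement"),
        (node, RDF + "subject", subject),
        (node, RDF + "predicate", predicate),
        (node, RDF + "object", obj),
    ]
    # Qualifiers and the source are attached to the statement node,
    # not to France itself.
    for prop, value in annotations.items():
        triples.append((node, EX + prop, value))
    return triples


triples = reify(
    "stmt1",
    EX + "France",
    EX + "populationDensity",
    "116 per square kilometer",
    {"asOf": "2012 estimate", "source": "Bilan demographique 2010"},
)
for triple in triples:
    print(triple)
```

One sentence becomes six triples, four of which only exist to give the
statement an identity that the date and the source can refer to.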

We still have not finished the document describing how to map our data
model to OWL/RDF, but we have thought about this the whole time while
discussing the data model.

But if you find a simpler, and more RDFish way to express the above
statement, please feel free to enlighten me. I would be indeed very
interested.

Cheers,
Denny



2012/4/5 Martynas Jusevicius marty...@graphity.org

 it doesn't look like reuse of existing concepts and standards is a
 priority for this project.
 One cannot build a Semantic Web application by ignoring its main
 building block, which is the RDF data model.




Re: [Wikidata-l] Data_model: Metamodel: Wikipedialink

2012-04-05 Thread Markus Krötzsch

On 04/04/12 23:23, Gregor Hagedorn wrote:

Wikidata can (and probably will) store information about each moon of
Uranus, e.g., its mass. It probably does not make sense to store the mass of
Moons of Uranus if there is such an article. It does not help to know that
the article Moons of Uranus also talks (among other things) about some
moon that has a particular mass: you need to know what *exactly* you are
talking about to exploit this data. An article on Moons of Uranus could
still (eventually) embed Wikidata data to improve its display, but this data
must refer to individual moons, not to the article as a whole.


The problem I see is that you have no definition to which real object
the data are tied. We agree that the problem is not the interwiki
links per se. It is what results from it. How do we tie data to a
wikidata page when we don't know what it is about?


This is a hard question. The best answer I can come up with now (on the 
bus to Oxford) is as follows: the meaning of Wikidata items is subject 
to social agreement, based on shared experience, communication, and 
human-language documentation. The latter is provided in labels and 
descriptions, in Wikipedia articles that are connected to a Wikidata 
item, and also in Wikidata property pages that document properties.


I know that this may not be a satisfactory answer to your question of 
how we can *really* *know* what a Wikidata item is about. If you want to 
dig deeper into this issue, there is a lot of interesting literature, 
which can give you many more details than I can. What we are dealing 
with is the well-known philosophical problem of /grounding/. In essence, 
the state of discussion boils down to the following: there is no known 
way of connecting the symbols of a purely symbolic system (such as a 
computer program) to real-world objects in a formal way. Going deeper 
into the discussion reveals that there is also no agreed-upon way to
clarify the meaning of "real" and "object" in the first place.


In spite of all this, humans somehow manage to understand each other, 
which brings us to the point of how amazing they all are :-) Wikidata is 
but a humble technical tool that provides an environment for 
articulating and (I hope) improving this understanding in a novel way. 
This cannot provide a formal grounding, but it might come as close to 
this ideal as we have gotten yet.


Regards,

Markus

--
Dr. Markus Kroetzsch
Department of Computer Science, University of Oxford
Room 306, Parks Road, OX1 3QD Oxford, United Kingdom
+44 (0)1865 283529   http://korrekt.org/



Re: [Wikidata-l] Type namespace

2012-04-05 Thread John McClure
Hi Denny -
Thanks for your reply; I am relieved. The design seems to be moving
towards looking quite a lot like ISO Topic Maps, I must say, because
it designates no wall of separation between classes and topics. Today that
wall exists in SMW in the dichotomy of Category vs all-other-namespaces,
with the problems I've outlined. I'm reading into your document that there
will be no wall - that the topics describing classes surely will exist in
the same 'namespace' as the topics purported to be instances of these
classes.

Is this correct? If so, then there's less difference between ISO Topic Maps
and your design than what I had originally thought. If indeed the direction
of the project (as I detect on this email list) is to associate pages with
classification schemes such as LCSH or many others, then we're talking about
even more an ISO Topic Map orientation. Which brings me back to the many
benefits of a *brutally honest* adoption of the ISO Topic Map technology.

Extend & refine it for sure, but imho ISO Topic Map technology is an
excellent fit with wiki implementations. It seems to be what you're
incidentally doing anyway.
cheers - john


Re: [Wikidata-l] Is a mailing-list efficient to discuss the Wikidata project?

2012-04-05 Thread Jan Kučera
Those of you having problems reading mailing lists probably want to
start using Gmail, as it nicely aggregates mails into threads so
you can read them forum-style...

2012/4/4 Platonides platoni...@gmail.com:
 On 04/04/12 06:33, aniketkarmar...@aniketkarmarkar.com wrote:
 Hi Everyone,
 I must agree that these emails are getting a bit overwhelming. I am not
 even finding time to read more than 1 or 2 of them.

 I think a forum would be very helpful to keep the ideas organized.
 Aniket

 Why would a forum be easier for you?
 I recommend you to group the mailing list in threads (which is
 supposedly possible with your MUA [1]).
 That way it's much easier to follow (or discard) the conversations, and
 probably gives you the benefits that you expect from a forum.


 1-http://www.horde.org/apps/imp/

 PS: Kudos to Bináris for his great message on good email clients.



Re: [Wikidata-l] Is a mailing-list efficient to discuss the Wikidata project?

2012-04-05 Thread Leukippos Institute
I have to admit that it is easy to lose the overview. What about a
Facebook group?



Re: [Wikidata-l] Is a mailing-list efficient to discuss the Wikidata project?

2012-04-05 Thread emijrp
2012/4/6 Leukippos Institute leukipposinstit...@googlemail.com

 I have to admit that it is easy to miss the overview. What about a
 Facebook group?


You can't organize the most important project for the Internet in a while
using that thing called Facebook.


Re: [Wikidata-l] Is a mailing-list efficient to discuss the Wikidata project?

2012-04-05 Thread Leukippos Institute
I am not fixed on Facebook. I was just thinking about a place where we can
have a structured view of the posts and the responses. I lose the
overview with all these emails. Any suggestions?



Re: [Wikidata-l] Type namespace

2012-04-05 Thread Denny Vrandečić
Hi John,

no, you have seen correctly that there is no separation between classes and
instances. If this brings our model closer to topic maps, then this is
convenient.

I have to admit that my knowledge of topic maps is quite limited. As far as
I understand it, they are an ISO standard and can be bought here, e.g. the
data model: 
http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=40017


We will not export our data in a format that has a description that is not
available for free.

But this is no problem anyway: if I understand topic maps correctly it
should be trivial to write a transformer that takes the export that
Wikidata will offer and translates it into topic maps, if you are so
inclined. This way the topic maps community can be served through that
transformer easily, be it a web service or a parser-front-end.

Cheers,
Denny




Re: [Wikidata-l] SNAK - assertion?

2012-04-05 Thread John McClure
Denny said:
But if you find a simpler, and more RDFish way to express the (below)
statement, please feel free to enlighten me. I would be indeed very
interested.

The population density of France, as of a 2012 estimate, is 116 per square
kilometer, according to the Bilan demographique 2010.

A wiki namespace-based approach is:

France has_subobject France#Density:2012_pop_estimate_Bilan_2010
France#Density:2012_pop_estimate_Bilan_2010 ^source Bilan ...2010
France#Density:2012_pop_estimate_Bilan_2010 ^npkm2 116
France#Density:2012_pop_estimate_Bilan_2010 Type Estimated
France#Density:2012_pop_estimate_Bilan_2010 ^date 2012

Key:
France#Density:2012_pop_estimate_Bilan_2010 is a named subobject
This subobject is so named to prevent subobject name collisions
Subobject names follow pagename naming conventions
has_subobject is a reserved property name
Density is an instance in a Type (or, Noun) namespace
All properties prefixed by ^ are text properties
^source and ^date are Dublin Core properties (both text properties)
Type is a Dublin Core property (an object property)
Estimated is an adjective treated as a subclass of owl:Class
Estimated is an abbreviation of Estimated Things
^npkm2 is an SI unit (amount per sq km)
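
As a rough illustration of how the lines above could be read by
software, the sketch below groups them by subject and separates text
properties (prefixed by ^) from the rest. The parsing rule (subject,
property, then the remainder of the line as value) is an assumption, not
part of the proposal; the elided source title is kept as written.

```python
LINES = """\
France has_subobject France#Density:2012_pop_estimate_Bilan_2010
France#Density:2012_pop_estimate_Bilan_2010 ^source Bilan ...2010
France#Density:2012_pop_estimate_Bilan_2010 ^npkm2 116
France#Density:2012_pop_estimate_Bilan_2010 Type Estimated
France#Density:2012_pop_estimate_Bilan_2010 ^date 2012
"""


def parse(lines: str) -> dict:
    """Group 'subject property value' lines into {subject: {property: value}}."""
    result = {}
    for line in lines.splitlines():
        if not line.strip():
            continue
        subject, prop, value = line.split(None, 2)   # value may contain spaces
        result.setdefault(subject, {})[prop] = value
    return result


def is_text_property(prop: str) -> bool:
    """Per the key above, properties prefixed by ^ are text properties."""
    return prop.startswith("^")


parsed = parse(LINES)
subobject = parsed["France"]["has_subobject"]
for prop, value in parsed[subobject].items():
    kind = "text" if is_text_property(prop) else "object"
    print(f"{prop} ({kind}): {value}")
```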




Re: [Wikidata-l] Namespace-based model

2012-04-05 Thread Denny Vrandečić
John,

your suggestion has two requirements that I think are hard to achieve:

* first, we need an agreement on the set of (non-overlapping but complete)
types that exist in the world
* second, we would need to assume that the Wikidata editors would agree on
one and exactly one type for every item, and not change that anymore. And
this seems to have to happen during the creation of the item.

I find both assumptions rather strong. What type would Tuesday be? Or
Roman-catholic religion? What type is Love? Who decided on the types?
What are the conditions for typeness? You give Person and Place and Date as
types. So Obama is a Person. Is Gollum a Person? Hal-9000? Noah, the
builder of the ark? Enos, the chimpanzee that traveled into space?

I think the assumption that everything has exactly one type is
oversimplifying.

Cheers,
Denny


2012/4/5 John McClure jmccl...@hypergrove.com


 Wiki namespaces are currently so underused people may not realize their
 importance: they provide crucial semantic information. For instance,
 consider the example given in Wikidata's data model article[1]
  Obama was US Senator from Illinois from January 3, 2005 to November 16,
 2008

 which yielded these observations:

- mainSnak of type PropertyValueSnak with subject Obama, property
US Senator from, and value Illinois
- auxiliary Snak of type PropertyIntervalSnak with property in
office and interval January 3, 2005 to November 16, 2008 (the subject of
the auxiliary Snak is always the statement itself).
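
The two observations above can be written down as a small data
structure. This is only a sketch: the Snak type names come from the data
model article, but the Python classes, field names and the describe()
helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class PropertyValueSnak:
    subject: str
    prop: str
    value: str


@dataclass(frozen=True)
class PropertyIntervalSnak:
    # No subject field: the subject of an auxiliary Snak is always
    # the statement that contains it.
    prop: str
    start: str
    end: str


@dataclass
class Statement:
    main_snak: PropertyValueSnak
    auxiliary_snaks: List[PropertyIntervalSnak] = field(default_factory=list)

    def describe(self) -> str:
        main = self.main_snak
        text = f"{main.subject} | {main.prop} | {main.value}"
        for aux in self.auxiliary_snaks:
            text += f" [{aux.prop}: {aux.start} to {aux.end}]"
        return text


statement = Statement(
    main_snak=PropertyValueSnak("Obama", "US Senator from", "Illinois"),
    auxiliary_snaks=[
        PropertyIntervalSnak("in office", "January 3, 2005", "November 16, 2008"),
    ],
)
print(statement.describe())
```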

 An alternative lexical model might restate this as
  The US Senator for the place Illinois is/was the person Obama from date
 January 3, 2005 until date November 16, 2008.

- the prime resource being described is a US Senator page not so much
the Obama page
- Person:Obama is the subject complement of this US Senator via the
linking verb-property 'was' or is
- for is a property of this US Senator whose value is
Place:Illinois
- from is a property of this US Senator with the value Date:January
3, 2005
- until is a property of this US Senator with the value
Date:November 16, 2008

 A significant point is that that US Senator page is named Senator:Barack H
 Obama (or, Legislator:Barack H Obama or Public Employee:Barack H Obama,
 etc); it is of type US Senator, and it has these three properties, for,
 from, and until. In other words, if the content from this page is to be
 shown on the Person:Barack H Obama page, then that content should be
 transcluded from the Senator page; its semantic markup need not because
 software can interpret transcluded material as being a subject of, or
 organic to, the Person page.

 Lastly I really don't know how developers will cognitively absorb made-up
 words like Snak. The need for the term does mystify me somewhat. I do think
 everyone seems to get namespaces, appreciating the clarity they provide.
 I hope concepts like namespace can be equally as prominent at this stage
 as Snaks in the Wikidata model. Regards, 
 --Hypergrove (talk) 03:03, 5 April 2012 (UTC)

 [1] http://meta.wikimedia.org/w/index.php?title=Talk:Wikidata/Data_model



Re: [Wikidata-l] Data_model: Metamodel: Wikipedialink

2012-04-05 Thread Brent Hecht
Hi All,

Our data (using a 25-language dataset) agrees with Denny's. 99% of all 
connected components of the interlanguage link graph have only one article per 
language edition. This is something we looked into in some detail in our paper 
at ACM's CHI conference this year 
(http://www.brenthecht.com/papers/bhecht_CHI2012_omnipedia.pdf).
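
For readers who want to see what this kind of check looks like, here is
a minimal sketch: build an undirected graph from interlanguage links,
find its connected components, and flag components that contain more
than one article in the same language. It is not the code used for the
paper, and the links are made-up toy data.

```python
from collections import defaultdict


def connected_components(links):
    """Return the connected components of an undirected link graph."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:                      # iterative depth-first search
            node = stack.pop()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


def one_article_per_language(component):
    """True if no language edition appears twice in the component."""
    languages = [lang for lang, _title in component]
    return len(languages) == len(set(languages))


# Toy data: articles are (language, title) pairs.
links = [
    (("en", "Berlin"), ("de", "Berlin")),
    (("en", "Kangoo"), ("de", "Kangoo")),
    (("de", "Kangoo"), ("da", "Kangoo")),
    (("da", "Kubistar"), ("de", "Kubistar")),
    (("de", "Kubistar"), ("en", "Kangoo")),   # link that merges two topics
]
components = connected_components(links)
clean = [c for c in components if one_article_per_language(c)]
print(len(components), "components,", len(clean), "with one article per language")
```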

However, it is important to point out that the 1% tends to contain articles
that are of great general interest. Some English articles that occur in these
situations include "author", "art", "indigenous people", "education",
"privacy", "liberal arts", "computer science", "agriculture", "socialism",
"army", etc. To a certain extent, this is to be expected. Where there is more
global interest in a topic, there is going to be more ambiguity.

Just my two cents.

- Brent


Brent Hecht
Ph.D. Candidate in Computer Science
CollabLab: The Collaborative Technology Laboratory
Northwestern University
w: http://www.brenthecht.com
e: br...@u.northwestern.edu


On Apr 5, 2012, at 4:50 PM, Denny Vrandečić wrote:

 Regarding definitions:
 
 Note that I said Label + Description is identifying, not merely the label. 
 I assume this to be true because even for your example of Germany, the 
 disambiguation page works with rather short descriptions of each 
 disambiguated page [1]. So even the fuzzy concept that you gave as an example 
 seems to be sufficiently identifiable for the sake and mission of the 
 Wikipedia community, which gives me reason to believe that the community can 
 sort this out. I mean, they basically already have! 
 
 Regarding the Kangoo / Kubistar example:
 
 In Wikidata they would be represented as two pages, one for the Kubistar 
 (which would link to the Danish and German page for the Kubistar), and one 
 for the Kangoo (which would link to the 20 language versions of the Kangoo 
 article, including a Danish and a German one). This is a rather simple 
 example, which would be easily expressed with the exact matches that we 
 suggest.
 
 In Wikidata, the Wikipedia links are planned to be inverse functional - i.e., 
 every Wikipedia article in a specific language can only be linked to from one 
 single Wikidata article. Two Wikidata pages cannot claim the same Wikipedia 
 article in a single language as their defining article.
 
 I.e. in the Kubistar/Kangoo example there would be two Wikidata pages. One 
 about the Kubistar, linking to de:Nissan_Kubistar and da:Nissan_Kubistar, and 
 one about the Kangoo, linking to the 20 different Kangoo articles. The 
 Wikidata page for Kubistar could not link to any of those Kangoo articles.
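
The inverse functional constraint described above can be sketched as follows. This is only an illustration under stated assumptions: the `SitelinkRegistry` class, its method names, and the item ids "kubistar" and "kangoo" are hypothetical and do not come from the Wikidata data model draft.

```python
class SitelinkRegistry:
    """Hypothetical registry in which every (language, article) pair
    can be claimed by at most one Wikidata page."""

    def __init__(self):
        self._owner = {}  # (lang, title) -> id of the claiming page

    def link(self, item_id, lang, title):
        key = (lang, title)
        owner = self._owner.get(key)
        if owner is not None and owner != item_id:
            # A second page tries to claim an article that is already
            # linked: this is what inverse functionality forbids.
            raise ValueError(f"{lang}:{title} is already claimed by {owner}")
        self._owner[key] = item_id

    def links_of(self, item_id):
        return sorted(k for k, v in self._owner.items() if v == item_id)


registry = SitelinkRegistry()
registry.link("kubistar", "de", "Nissan_Kubistar")
registry.link("kubistar", "da", "Nissan_Kubistar")
registry.link("kangoo", "de", "Renault_Kangoo")

# The Kubistar page cannot also claim a Kangoo article.
try:
    registry.link("kubistar", "de", "Renault_Kangoo")
    rejected = False
except ValueError:
    rejected = True
```

Keying the lookup on the (language, title) pair makes the check a single dictionary access, at the cost of a linear scan when listing the links of one page.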
 
 Please do not misunderstand, I am not categorically against inexact matches 
 or broader or narrower ones (or else I wouldn't be discussing them). But I haven't seen 
 examples yet that convince me that the additional complexity of 
 broader/narrower or inexact matches is required. As I said before, if we can model 
 more than 99% of all language links with the suggested simple solution, I am 
 reluctant to make it more complicated for the remaining 1%.
 
 Cheers,
 Denny
 
 P.S.: oh, yes, indeed! Thank you for this excellent and interesting 
 discussion, it really does shed light on some of the aspects of the current 
 draft of the data model, and will eventually improve it and sharpen the 
 understanding of the model. 
 
 [1] https://en.wikipedia.org/wiki/Germany_(disambiguation)
 
 
 
 2012/4/5 Gregor Hagedorn g.m.haged...@gmail.com
 On 5 April 2012 18:30, Denny Vrandečić denny.vrande...@wikimedia.de wrote:
  The label and the description together are meant to be identifying.
 
  I.e. Georgia - A country in central Asia, or Frankfurt - A city in Hesse,
  Germany, etc.
 
  Additionally, the Wikipedia links provide quite some guidance to it.
 
 I believe it will be difficult to craft labels that work as
 definitions. A label is hinting, and may often be sufficiently precise
 for the majority of purposes. If we speak of Germany it is very hard
 to express in a simple string the different historical, geographical,
 political delimitations that this term may carry.
 
 In my own field of work even technical terms are often difficult to
 resolve to a definition. In biology, the width of taxon delimitations
 changes over time and with new research, and even technical terms in
 morphology often have quite different meanings, depending on the
 school that is being followed.
 
 Or to cite a car example again: The label Renault Kangoo is
 unspecific as to the version/revision/release of it, so technical data
 that vary between these versions can not be added to it. However, the
 de.wikipedia.org/wiki/Nissan_Kubistar is in most Wikipedias also
 subsumed under Renault Kangoo. So it is a valid assumption that when
 labeling something Renault Kangoo it refers to both of these
 identical models sold under different names. But then, the Nissan
 Kubistar is only equivalent to the first version/revision/release of
 the Renault Kangoo...
 
 This is not unsolvable, but if you want to import 

Re: [Wikidata-l] Is a mailing-list efficient to discuss the Wikidata project?

2012-04-05 Thread Helder
On Thu, Apr 5, 2012 at 18:27, Jan Kučera kozuc...@gmail.com wrote:
 You guys experiencing problems with reading mailing lists probably
 want to start using Gmail, as it nicely aggregates mails in threads so
 you can read them forum-style...

+1



Re: [Wikidata-l] Data_model: Metamodel: Wikipedialink

2012-04-05 Thread Denny Vrandečić
Thanks, Brent! I was hoping to get some numbers exactly from you :)

I am extremely curious what kind of statements people will make in the
Wikidata pages about art, privacy, agriculture, army, etc. I am
looking forward to seeing what the community will add there. That'll be fun to
watch :)

(Usually, such things tend to be retroactively obvious, but extremely hard
to predict :) )

Cheers,
Denny


Re: [Wikidata-l] SNAK - assertion?

2012-04-05 Thread Denny Vrandečić
John,

thanks! I fully agree.
And this is indeed pretty much what we have in our data model.*
I think that we really need to get our draft mapping to RDF done, in order
to show that we align pretty much with this suggestion.

Cheers,
Denny

* well, we also add density, but I think that is merely an oversight in
your suggestion, and you forgot to add something like
France#Density:2012_pop_estimate_Bilan_2010 property Density .

2012/4/6 John McClure jmccl...@hypergrove.com

 Denny said:
 But if you find a simpler, and more RDFish way to express the (below)
 statement, please feel free to enlighten me. I would be indeed very
 interested.
 The population density of France, as of a 2012 estimate, is 116 per square
 kilometer, according to the Bilan demographique 2010.
 A wiki namespace-based approach is:

 France has_subobject France#Density:2012_pop_estimate_Bilan_2010
 France#Density:2012_pop_estimate_Bilan_2010 ^source Bilan ...2010
 France#Density:2012_pop_estimate_Bilan_2010 ^npkm2 116
 France#Density:2012_pop_estimate_Bilan_2010 Type Estimated
 France#Density:2012_pop_estimate_Bilan_2010 ^date 2012

 Key:
 France#Density:2012_pop_estimate_Bilan_2010 is a named subobject
 This subobject is so named to prevent subobject name collisions
 Subobject names follow pagename naming conventions
  has_subobject is a reserved property name
 Density is an instance in a Type (or, Noun) namespace
 All properties prefixed by ^ are text properties
 ^source and ^date are Dublin Core properties (both text properties)
 Type is a Dublin Core property (an object property)
 Estimated is an adjective treated as a subclass of owl:Class
  Estimated is an abbreviation of Estimated Things
 ^npkm2 is an SI unit (amount per sq km)
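
The statements and key above can be sketched as plain (subject, property, value) tuples. This is only an illustration: the helper names `split_subobject` and `describe` are hypothetical, the values are copied as written in the listing (including the elided source title), and nothing here is an existing wiki API.

```python
SUBOBJECT = "France#Density:2012_pop_estimate_Bilan_2010"

# The five statements from the listing, as (subject, property, value).
triples = [
    ("France", "has_subobject", SUBOBJECT),
    (SUBOBJECT, "^source", "Bilan ...2010"),
    (SUBOBJECT, "^npkm2", "116"),
    (SUBOBJECT, "Type", "Estimated"),
    (SUBOBJECT, "^date", "2012"),
]


def split_subobject(name):
    """Split 'Page#Namespace:local_name' into its three parts."""
    page, _, fragment = name.partition("#")
    namespace, _, local = fragment.partition(":")
    return page, namespace, local


def describe(subject):
    """Collect all property/value pairs stated about one subject."""
    return {prop: value for subj, prop, value in triples if subj == subject}


# Properties prefixed by ^ are text properties, per the key.
text_props = sorted(p for p in describe(SUBOBJECT) if p.startswith("^"))
page, namespace, local = split_subobject(SUBOBJECT)
```

Here `namespace` comes out as "Density", which is the sense in which the subobject name already carries its noun without a separate statement.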






-- 
Project director Wikidata
Wikimedia Deutschland e.V. | Eisenacher Straße 2 | 10777 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de



Re: [Wikidata-l] SNAK - assertion?

2012-04-05 Thread John McClure
Denny said:
you forgot to add something like
France#Density:2012_pop_estimate_Bilan_2010 property Density .

No, I did not forget anything, given the Density 'namespace' in the subobject
name.
IOW, your triple merely restates what is discernible from the subobject name.
Maybe you should tell me what a property named "property" is supposed to represent.

At most, I misstated that Estimated is "an adjective treated as a
subclass of owl:Class".
It should say "as an instance of owl:Class".


Re: [Wikidata-l] Namespace-based model

2012-04-05 Thread John McClure
It's more accurate to say that your belief is an artifact of present tools.
RDF has just one way to associate a Class with an object, the rdf:type
attribute. Because RDF makes no distinction between classes
that represent a type-of-thing (e.g. a Character) and classes that represent a
facet-of-thing (e.g. Fictional), present tools must allow multiple classes to
be associated with any resource. Obviously a given resource can have
multiple facets. In my work I store facet-classes in the Dublin Core
Coverage and Format properties and I store a single existential-class in the
Dublin Core Type property for the page; the page's template restates both
kinds of classes as Categories for the page (hence my piqued email to at
least define existential classes in a separate namespace from category).

So if no distinction is made, then multiple types are indeed necessary. If
a distinction between nouns and adjectives is made, then one type + multiple
facets is necessary.
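
A minimal sketch of the one-type-plus-facets idea, assuming a page is modelled as a small record. The `Page` class and its fields are hypothetical, and it collapses the Coverage and Format properties into a single set of facets for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    title: str
    type: str                        # the single existential class (a noun)
    facets: frozenset = frozenset()  # adjectives, e.g. "Fictional"

    def categories(self):
        # The page template restates both kinds of classes as categories.
        return {self.type} | set(self.facets)

    def rdf_types(self):
        # A flat RDF view cannot tell noun from adjective: every class
        # simply becomes one more rdf:type value.
        return sorted(self.categories())


gollum = Page("Gollum", "Character", frozenset({"Fictional"}))
flat = gollum.rdf_types()
```

The flat view lists "Character" and "Fictional" side by side, while the record still keeps track of which of the two is the type.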
  -Original Message-
  From: John McClure [mailto:jmccl...@hypergrove.com]
  Sent: Thursday, April 05, 2012 7:08 PM
  To: Wikidata (E-mail)
  Subject: [Wikidata-l] Namespace-based model


  Denny said:
  I think the assumption everything has exactly one type is oversimplifying

  The assumption that everything is of multiple types is over-complicating.
  Usually you can tell from the first sentence in the Wikipedia page.

  Tuesday is a day of the week
  Love is an emotion
  (Roman) Catholicism is a faith
  Gollum is a fictional character
  HAL-9000 is a character
  Noah is a Patriarch
  Enos was the first chimpanzee

  So consensus certainly is being achieved among thousands of authors about
the fundamental type of thing each of these pages represents. Disambiguation
pages very commonly reference these types of things, as in Enos
(chimpanzee).

  Let's take Gollum. I can imagine a topic map has these subjects:
  1. Character
  1A. Fictional character
  1A1. Fictional person
  1A2. Fictional animal
  1A3. Fictional ghost
  1A4. Fictional god

  Another equally valid assertion is that Gollum is a Character that is
typed as a Fictional and Human thing (both adjectives being instances
of owl:Class) -- so that a comprehensive system sometime in the future could
reinterpret Gollum as actually being a Fictional person.

  As you say yourself, it's not useful to create a perfect system to
handle every imaginable edge case **to the extent that they exist**.
Personally I don't believe such edge cases can be found - I challenge anyone
to provide me such an example.

  But more to the point of Wikidata. I don't believe for a second that WP
will be reorganized into thousands of namespaces. Rather, I believe first,
SUBOBJECT names must include the idea of 'namespace' for the efficiencies
gained, and second, WP pages should be associated with the same set of nouns
(noun-phrases) available for subobject names. IOW, it's an implementation
issue whether a wiki's pages are named using these namespaces, so that the
wiki as a whole can gain the same inherent efficiencies I've sketched for
subobjects.

  Best - john