Re: [Wikidata-l] question about Inclusion policy discussion

2013-03-15 Thread Mathieu Stumpf

On 2013-03-14 19:30, Michael Hale wrote:

> In general, I do like the idea of periodically collecting article
> references into Wikidata. They are a type of structured data that is
> associated with every article, and there are lots of interesting
> queries that would be easier to do if that information was in a
> structured database. I don't know if it will help make issues
> regarding finding and verifying references that we already encounter
> on Wikipedia any easier. I spend most of my time on the English
> Wikipedia, and the only times (so far) that I've intentionally gone to
> the article in another language are for culturally specific holidays.
> The only thing that I really notice is that they often have better
> pictures, because other than that I have to rely on Google Translate.


Well, for the specific purpose we are talking about, you wouldn't need
to go to other language editions[1]. Wikidata already links the
associated articles across the different editions. So if we add entries
for the relation between a statement and its reference, we can also add
an attribute recording which articles use it. And ta-da! You can
distribute this reference to all associated articles in all editions.
You can of course also have an attribute holding a translation of the
statement in each supported language, so contributors can identify
which sentences in their local-language article the reference supports.


[1] But of course, if you do understand the language of other editions,
that would give you more context than systematically structured
information alone.
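
As a rough illustration of the model described above, here is a minimal
Python sketch. The names Item, SourcedStatement and distribute, and all
sample identifiers and titles, are hypothetical and chosen only for this
example; this is not Wikidata's actual data model or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Item:
    """Hypothetical item that links the articles on one topic across editions."""
    item_id: str
    sitelinks: Dict[str, str]  # language code -> article title


@dataclass
class SourcedStatement:
    """Hypothetical record tying one statement to one reference."""
    item_id: str
    reference: str
    # language code -> the statement worded in that language
    translations: Dict[str, str] = field(default_factory=dict)


def distribute(
    statement: SourcedStatement, items: Dict[str, Item]
) -> List[Tuple[str, str, Optional[str]]]:
    """List every linked article together with the localized statement, if any."""
    item = items[statement.item_id]
    return [
        (lang, title, statement.translations.get(lang))
        for lang, title in sorted(item.sitelinks.items())
    ]


# Made-up sample data, for illustration only.
items = {
    "Q-example": Item(
        "Q-example",
        {"en": "Example festival", "fr": "Fête exemple", "de": "Beispielfest"},
    )
}
stmt = SourcedStatement(
    item_id="Q-example",
    reference="Hypothetical Author, Hypothetical Book, p. 12",
    translations={
        "en": "The example festival lasts three days.",
        "fr": "La fête exemple dure trois jours.",
    },
)

for lang, title, text in distribute(stmt, items):
    print(lang, title, text or "(translation missing)")
```

An edition without a translation still learns that the reference
exists, so a contributor there could add the missing wording.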



--
Association Culture-Libre
http://www.culture-libre.org/

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] question about Inclusion policy discussion

2013-03-14 Thread Denny Vrandečić
That is a tough question. We are pretty sure that we technically scale
quite well, and there is no reason for the community to restrict itself
for technical reasons. If the number of items suddenly increases by one
or two orders of magnitude, we would probably hit a few hiccups along
the way, but the architecture should be able to deal with that.

What I am much more worried about, though, is the scaling of the
community. One of the statements from my Wikidata talks is "we do not
want to become the biggest data heap out there, but rather aim for an
organic community, that is strong and resilient enough to maintain the
data that is being collected." See also Wikidata requirement #6
<http://meta.wikimedia.org/wiki/Wikidata/Notes/Requirements> (a page
worth re-reading).

Sometimes it might make sense for Wikidata to bridge and connect to external
data sources that have their own way of maintenance and curation. Should
the dataset really be merged into Wikidata? Is the data wikilike? Is it
used in the Wikimedia projects? Or could it be also provided as a linked
open dataset, which is referenced from Wikidata?

Just to give an example: sure, one could theoretically start to collect
temperature data of a city in hourly measurements*, but it could maybe make
more sense to point to an external site that collects this data in a more
efficient format, provide the mapping identifiers, and allow for a bot to
go there and discover the data. Wikidata in turn could provide an
aggregation of the data, which indeed would be used on e.g. Wikipedia and
Wikivoyage, but leave the full dataset on the external site.

(Which, by the way, would also be a viable solution for datasets which
have incompatible licenses.)
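
To make the aggregation idea concrete, here is a minimal Python sketch,
assuming the external site exposes hourly readings as (timestamp,
temperature) pairs. The function monthly_means, the record layout and
the identifiers are hypothetical; nothing here reflects an actual
Wikidata or external-site interface.

```python
from collections import defaultdict
from statistics import mean


def monthly_means(hourly):
    """Reduce (ISO timestamp, temperature in °C) pairs to one mean per month."""
    by_month = defaultdict(list)
    for timestamp, temperature in hourly:
        by_month[timestamp[:7]].append(temperature)  # "YYYY-MM" prefix
    return {
        month: round(mean(values), 1)
        for month, values in sorted(by_month.items())
    }


# Made-up readings standing in for the full dataset kept on the external site.
external_readings = [
    ("2013-01-01T00:00", -2.0),
    ("2013-01-01T01:00", -3.0),
    ("2013-02-01T00:00", 1.0),
    ("2013-02-01T01:00", 2.0),
]

# The item keeps only the mapping identifier and the small aggregate.
item_record = {
    "item": "Q-city (hypothetical)",
    "external_id": "station-0001 (hypothetical)",
    "monthly_mean_temperature_c": monthly_means(external_readings),
}
print(item_record)
```

The item stays small because it stores one value per month rather than
one per hour, while the identifier lets a bot find the full series.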

I hope this makes sense, Cheers,
Denny

* Actually, this kind of data would probably kill us faster than creating
many items, as it would make a single item ginormous. We do not scale
that well in that direction.





-- 
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 B. Recognized as a non-profit
by the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.


Re: [Wikidata-l] question about Inclusion policy discussion

2013-03-14 Thread Michael Hale
In general, I do like the idea of periodically collecting article references 
into Wikidata. They are a type of structured data that is associated with every 
article, and there are lots of interesting queries that would be easier to do 
if that information was in a structured database. I don't know if it will help 
make issues regarding finding and verifying references that we already 
encounter on Wikipedia any easier. I spend most of my time on the English 
Wikipedia, and the only times (so far) that I've intentionally gone to the 
article in another language are for culturally specific holidays. The only 
thing that I really notice is that they often have better pictures, because 
other than that I have to rely on Google Translate.



Re: [Wikidata-l] question about Inclusion policy discussion

2013-03-14 Thread Mathieu Stumpf

On 2013-03-14 02:09, Michael Hale wrote:

> I think of Wikidata as the symbiotic version of Freebase. I won't say
> Freebase is a parasite, but I think a core aspect of Wikidata is that
> edits to the database will often feed back into the encyclopedia in
> various places. I haven't looked too much at the technical
> implementation of Wikidata yet, but databases with billions of items
> aren't that rare anymore.


In this connection, I would like to take the opportunity to ask whether
we should include references in Wikidata and, what would be even more
awesome, relations between statements/theses and a particular author. I
think this could help Wikipedia with its no-original-research goal, and
make references consistent across language editions.


Moreover, this could also be used to attach to a statement both an
attribution reliability and a relevancy reliability. Let's say I read
an article on some foreign ancient culture. This article reports some
statements which are, at first glance, well sourced. But one reference
happens to be a book that I can't get. Some research shows me that the
book does exist, but is no longer publicly available. So I can't check
whether what is claimed in the Wikipedia article is what is claimed in
the book. But other people may have a copy, so they could give feedback
to the community, confirming or invalidating that the statement can
indeed be found in the book. Another case may be that a reference is
readable directly on the internet, but the text is written in a foreign
dead language that you don't know and for which you can't find an
automatic translator. So despite having the source right before your
eyes, you can't check that the text makes the statement. You may of
course ask for validation on the discussion page, or check whether
someone has left feedback on the topic. But it would be far better if
feedback from knowledgeable people could be gathered whatever the
language edition they use, and redistributed to all editions.
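
One way to picture this shared feedback is the minimal Python sketch
below. The Feedback record, its fields and the summarize function are
hypothetical and exist only for illustration; no such feature or API is
described in this thread.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Feedback:
    """Hypothetical record of one reviewer checking a reference."""
    reference_id: str
    wiki: str        # language edition the reviewer works on
    reviewer: str
    confirms: bool   # True if the statement was found in the source


def summarize(feedback: Iterable[Feedback], reference_id: str) -> Dict[str, int]:
    """Count confirmations and disputes for one reference across all editions."""
    votes = Counter(
        f.confirms for f in feedback if f.reference_id == reference_id
    )
    return {"confirmed": votes[True], "disputed": votes[False]}


# Made-up feedback collected from three different editions.
feedback = [
    Feedback("ref-1", "en", "reviewer-a", True),
    Feedback("ref-1", "fr", "reviewer-b", True),
    Feedback("ref-1", "de", "reviewer-c", False),
    Feedback("ref-2", "en", "reviewer-a", True),
]
summary = summarize(feedback, "ref-1")
print(summary)
```

Because the records are keyed by reference rather than by article, the
same tally could be shown next to the citation in every edition.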


What do you think of that?
--
Association Culture-Libre
http://www.culture-libre.org/



Re: [Wikidata-l] question about Inclusion policy discussion

2013-03-13 Thread Michael Hale
I think of Wikidata as the symbiotic version of Freebase. I won't say Freebase 
is a parasite, but I think a core aspect of Wikidata is that edits to the 
database will often feed back into the encyclopedia in various places. I 
haven't looked too much at the technical implementation of Wikidata yet, but 
databases with billions of items aren't that rare anymore.



[Wikidata-l] question about Inclusion policy discussion

2013-03-13 Thread Benjamin Good
I've been struggling to understand what should go into Wikidata and what
should not. I see that this is because it hasn't been decided yet ;)
http://www.wikidata.org/wiki/Wikidata_talk:Notability

In helping the community to make this decision I think it would be really
helpful for the developers to weigh in on the technical capacity of the
envisioned/realized Wikidata infrastructure. If we know how big the system
could realistically be and continue to work well technically, it might help
discussions about how much and what kind of content we should put into it.
If the plan is to cope with only a few tens of millions of subjects, that
is quite different than if the plan allows for the potential creation of
billions of items. (Suggesting less inclusive versus more inclusive
policies.)

?

-Ben