Re: [Wikidata-l] question about Inclusion policy discussion
On 2013-03-14 19:30, Michael Hale wrote:

> In general, I do like the idea of periodically collecting article references into Wikidata. They are a type of structured data that is associated with every article, and there are lots of interesting queries that would be easier to do if that information were in a structured database. I don't know whether it will make the problems we already encounter on Wikipedia with finding and verifying references any easier. I spend most of my time on the English Wikipedia, and the only times (so far) that I've intentionally gone to the article in another language are for culturally specific holidays. The only thing that I really notice is that they often have better pictures, because other than that I have to rely on Google Translate.

Well, for the specific purpose we are talking about, you wouldn't need to go to other chapters [1]. Wikidata already includes the associated articles across the different chapters. So if we add entries for the relations between a statement and a reference, we can also add an attribute recording which articles use it. And ta-da! You can distribute this reference to all associated articles in all chapters. And you can of course have an attribute holding a translation of the statement in each supported language, so contributors can identify which sentences in their local-language article it can serve as a reference for.

[1] But of course, if you do understand the other chapters' languages, that would give you more context than systematically structured information alone.

--
Association Culture-Libre
http://www.culture-libre.org/
___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l
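The statement-to-reference relation proposed in the message above can be sketched as a small data structure. This is only an illustration in Python under stated assumptions, not Wikidata's actual data model: the names `ReferenceLink`, `candidate_articles`, and `statement_for` are hypothetical, and `sitelinks` merely stands in for the list of associated articles that the message says Wikidata already holds.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ReferenceLink:
    """One statement-to-reference relation shared by all language editions (hypothetical)."""
    reference: str                 # citation text or identifier of the source
    statement: dict[str, str]      # language code -> the statement in that language
    used_in: set[tuple[str, str]] = field(default_factory=set)  # (language, article title)


def candidate_articles(link: ReferenceLink, sitelinks: dict[str, str]) -> dict[str, str]:
    """Return language -> article title for associated articles not yet using the link.

    `sitelinks` maps a language code to the title of the associated article.
    """
    return {
        lang: title
        for lang, title in sitelinks.items()
        if (lang, title) not in link.used_in
    }


def statement_for(link: ReferenceLink, lang: str, fallback: str = "en") -> Optional[str]:
    """Statement text a contributor can match against sentences in the local article."""
    return link.statement.get(lang, link.statement.get(fallback))
```

With made-up data, a link used only in the English article would be offered to the French and German articles, and a German contributor would fall back to the English wording until a German translation of the statement is added.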
Re: [Wikidata-l] question about Inclusion policy discussion
That is a tough question. We are pretty sure that we scale quite well technically, and there is no reason for the community to restrict itself for technical reasons. If the number of items suddenly increased by one or two orders of magnitude, we would probably hit a few hiccups on the way, but the architecture should be able to deal with that.

What I am much more worried about, though, is the scaling of the community. One of the statements from my Wikidata talks is "we do not want to become the biggest data heap out there, but rather aim for an organic community, that is strong and resilient enough to maintain the data that is being collected." See also Wikidata requirement #6 <http://meta.wikimedia.org/wiki/Wikidata/Notes/Requirements> (a page worth re-reading).

Sometimes it might make sense for Wikidata to bridge and connect to external data sources that have their own way of maintenance and curation. Should the dataset really be merged into Wikidata? Is the data wikilike? Is it used in the Wikimedia projects? Or could it also be provided as a linked open dataset which is referenced from Wikidata?

Just to give an example: sure, one could theoretically start to collect temperature data of a city in hourly measurements*, but it might make more sense to point to an external site that collects this data in a more efficient format, provide the mapping identifiers, and allow a bot to go there and discover the data. Wikidata in turn could provide an aggregation of the data, which would indeed be used on e.g. Wikipedia and Wikivoyage, but leave the full dataset on the external site. (Which, by the way, would also be a viable solution for datasets that have incompatible licenses.)

I hope this makes sense,
Cheers,
Denny

* Actually, this kind of data would probably kill us faster than creating many items, as it would make a single item ginormous. We do not scale that well in that direction.
2013/3/14 Benjamin Good

> I've been struggling to understand what should go into Wikidata and what should not. I see that this is because it hasn't been decided yet ;)
> http://www.wikidata.org/wiki/Wikidata_talk:Notability
>
> To help the community make this decision, I think it would be really helpful for the developers to weigh in on the technical capacity of the envisioned/realized Wikidata infrastructure. If we knew how big the system could realistically become and still work well technically, it might help discussions about how much and what kind of content we should put into it. If the plan is to cope with only a few tens of millions of subjects, that is quite different from a plan that allows for the potential creation of billions of items (suggesting less inclusive versus more inclusive policies).
>
> ?
>
> -Ben

--
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Registered in the register of associations of the Amtsgericht Berlin-Charlottenburg under number 23855 B. Recognized as a non-profit organization by the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
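The split suggested in the message above (full hourly series on an external site, only a mapping identifier and an aggregate on the Wikidata side) can be sketched as follows. This is a minimal illustration in Python, not an existing Wikidata feature: `TemperatureSummary`, `summarize`, the identifier, and the URL are all hypothetical, and the choice of count, minimum, maximum, and average as the aggregate is an assumption.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class TemperatureSummary:
    """Small aggregate kept centrally; the hourly data stays at `source_url`."""
    external_id: str   # mapping identifier into the external dataset (hypothetical)
    source_url: str    # where the full hourly series remains available
    count: int         # number of hourly measurements that were aggregated
    minimum: float
    maximum: float
    average: float


def summarize(external_id: str, source_url: str,
              hourly_celsius: list[float]) -> TemperatureSummary:
    """Reduce an hourly series fetched by a bot to one compact record."""
    if not hourly_celsius:
        raise ValueError("no measurements to summarize")
    return TemperatureSummary(
        external_id=external_id,
        source_url=source_url,
        count=len(hourly_celsius),
        minimum=min(hourly_celsius),
        maximum=max(hourly_celsius),
        average=round(mean(hourly_celsius), 2),
    )
```

Whatever the length of the hourly series, the record stored per city stays the same size, which is the property the footnote in the message is concerned with.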
Re: [Wikidata-l] question about Inclusion policy discussion
In general, I do like the idea of periodically collecting article references into Wikidata. They are a type of structured data that is associated with every article, and there are lots of interesting queries that would be easier to do if that information were in a structured database. I don't know whether it will make the problems we already encounter on Wikipedia with finding and verifying references any easier. I spend most of my time on the English Wikipedia, and the only times (so far) that I've intentionally gone to the article in another language are for culturally specific holidays. The only thing that I really notice is that they often have better pictures, because other than that I have to rely on Google Translate.

> Date: Thu, 14 Mar 2013 13:50:02 +0100
> From: psychosl...@culture-libre.org
> To: wikidata-l@lists.wikimedia.org
> Subject: Re: [Wikidata-l] question about Inclusion policy discussion
>
> On 2013-03-14 02:09, Michael Hale wrote:
> > I think of Wikidata as the symbiotic version of Freebase. I won't say Freebase is a parasite, but I think a core aspect of Wikidata is that edits to the database will often feed back into the encyclopedia in various places. I haven't looked too much at the technical implementation of Wikidata yet, but databases with billions of items aren't that rare anymore.
>
> On a related note, I would like to take the opportunity to ask whether we should include references in Wikidata and, what would be even more awesome, relations between statements/theses and a particular author. I think this could help Wikipedia with its no-original-research goal and make references consistent across chapters.
>
> Moreover, this could also be used to attach to each statement an attribution reliability and a relevance reliability. Let's say I read an article on some foreign ancient culture. The article reports some statements which are, at first glance, well sourced. But one reference happens to be a book that I can't get. Some research shows me that the book does exist but is no longer publicly available. So I can't check whether what is claimed in the Wikipedia article is what is claimed in the book. But other people may have a copy, so they could give feedback to the community, confirming or invalidating that the statement can indeed be found in the book. Another case may be that a reference is readable directly on the internet, but the text is written in a foreign dead language that you don't know and for which you can't find an automatic translator. So despite having the source right before your eyes, you can't check that the text makes the statement. You may of course ask for validation on the discussion page, or check whether someone has left feedback on the topic. But it would be far better if feedback from knowledgeable people could be gathered whatever chapter they use, and redistributed to all chapters.
>
> What do you think of that?
> --
> Association Culture-Libre
> http://www.culture-libre.org/
Re: [Wikidata-l] question about Inclusion policy discussion
On 2013-03-14 02:09, Michael Hale wrote:

> I think of Wikidata as the symbiotic version of Freebase. I won't say Freebase is a parasite, but I think a core aspect of Wikidata is that edits to the database will often feed back into the encyclopedia in various places. I haven't looked too much at the technical implementation of Wikidata yet, but databases with billions of items aren't that rare anymore.

On a related note, I would like to take the opportunity to ask whether we should include references in Wikidata and, what would be even more awesome, relations between statements/theses and a particular author. I think this could help Wikipedia with its no-original-research goal and make references consistent across chapters.

Moreover, this could also be used to attach to each statement an attribution reliability and a relevance reliability. Let's say I read an article on some foreign ancient culture. The article reports some statements which are, at first glance, well sourced. But one reference happens to be a book that I can't get. Some research shows me that the book does exist but is no longer publicly available. So I can't check whether what is claimed in the Wikipedia article is what is claimed in the book. But other people may have a copy, so they could give feedback to the community, confirming or invalidating that the statement can indeed be found in the book.

Another case may be that a reference is readable directly on the internet, but the text is written in a foreign dead language that you don't know and for which you can't find an automatic translator. So despite having the source right before your eyes, you can't check that the text makes the statement. You may of course ask for validation on the discussion page, or check whether someone has left feedback on the topic. But it would be far better if feedback from knowledgeable people could be gathered whatever chapter they use, and redistributed to all chapters.

What do you think of that?

--
Association Culture-Libre
http://www.culture-libre.org/
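The cross-chapter feedback described in the message above (readers who hold the source confirming or invalidating that it really makes the statement) could be pooled along these lines. This is a rough sketch in Python under stated assumptions, not a proposal from the thread: `Verification` and `tally` are hypothetical names, each reviewer is counted once with their latest verdict, and the share of confirmations is just one possible reading of "attribution reliability".

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Verification:
    """One reader's verdict on whether the cited source makes the statement."""
    reviewer: str    # account name of the person who checked the source
    chapter: str     # edition the reviewer works on, e.g. "fr"
    confirms: bool   # True if the statement was found in the source


def tally(verifications: list[Verification]) -> dict:
    """Pool verdicts from every chapter into one summary for the reference."""
    # Keep only the most recent verdict per reviewer (later entries win).
    latest: dict[str, Verification] = {}
    for v in verifications:
        latest[v.reviewer] = v

    votes = Counter(v.confirms for v in latest.values())
    confirmed, rejected = votes[True], votes[False]
    total = confirmed + rejected
    share: Optional[float] = confirmed / total if total else None
    return {
        "confirmed": confirmed,
        "rejected": rejected,
        "chapters": sorted({v.chapter for v in latest.values()}),
        "share_confirmed": share,
    }
```

Because the summary is attached to the reference rather than to one article's discussion page, every chapter that uses the reference would see the same pooled result.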
Re: [Wikidata-l] question about Inclusion policy discussion
I think of Wikidata as the symbiotic version of Freebase. I won't say Freebase is a parasite, but I think a core aspect of Wikidata is that edits to the database will often feed back into the encyclopedia in various places. I haven't looked too much at the technical implementation of Wikidata yet, but databases with billions of items aren't that rare anymore.

> Date: Wed, 13 Mar 2013 17:51:47 -0700
> From: ben.mcgee.g...@gmail.com
> To: wikidata-l@lists.wikimedia.org
> Subject: [Wikidata-l] question about Inclusion policy discussion
>
> I've been struggling to understand what should go into Wikidata and what should not. I see that this is because it hasn't been decided yet ;)
> http://www.wikidata.org/wiki/Wikidata_talk:Notability
>
> To help the community make this decision, I think it would be really helpful for the developers to weigh in on the technical capacity of the envisioned/realized Wikidata infrastructure. If we knew how big the system could realistically become and still work well technically, it might help discussions about how much and what kind of content we should put into it. If the plan is to cope with only a few tens of millions of subjects, that is quite different from a plan that allows for the potential creation of billions of items (suggesting less inclusive versus more inclusive policies).
>
> ?
>
> -Ben
[Wikidata-l] question about Inclusion policy discussion
I've been struggling to understand what should go into Wikidata and what should not. I see that this is because it hasn't been decided yet ;)
http://www.wikidata.org/wiki/Wikidata_talk:Notability

To help the community make this decision, I think it would be really helpful for the developers to weigh in on the technical capacity of the envisioned/realized Wikidata infrastructure. If we knew how big the system could realistically become and still work well technically, it might help discussions about how much and what kind of content we should put into it. If the plan is to cope with only a few tens of millions of subjects, that is quite different from a plan that allows for the potential creation of billions of items (suggesting less inclusive versus more inclusive policies).

?

-Ben