Re: [Wikidata] Determining Wikidata Usage in Wikipedia Pages

2016-11-24 Thread Dimitris Kontokostas
Hi,

A related DBpedia GSoC project from this summer is described here
http://www.mail-archive.com/dbpedia-discussion@lists.sourceforge.net/msg07828.html

Some preliminary results from about a year ago that bootstrapped this project are here
https://lists.wikimedia.org/pipermail/wikidata/2015-December/007757.html


On Thu, Nov 24, 2016 at 3:07 PM, Daniel Kinzler  wrote:

> On 23.11.2016 at 21:33, Andrew Hall wrote:
> > Hi,
> >
> > I’m a PhD student/researcher at the University of Minnesota who (along with
> > Max Klein and another grad student/researcher) has been interested in
> > understanding the extent to which Wikidata is used in (English, for now)
> > Wikipedia.
> >
> > There seems to be no easy way to determine Wikidata usage in Wikipedia pages,
> > so I’ll describe two approaches we’ve considered as our best attempts at
> > solving this problem. I’ll also describe shortcomings of each approach.
>
> There are two pretty easy ways, which you may not have found because they
> were added only a couple of months ago:
>
> You can look at the "page information" (action=info, linked from the
> sidebar), e.g.
>  Telescope&action=info>.
> Near the bottom you can find "Wikidata entities used in this page".
>
> The same information is available via an API module,
>  wbentityusage&titles=South_Pole_Telescope>.
> See
>  query%2Bwbentityusage>
> for documentation.
>
>
> These URLs will list all direct and indirect usages, and also indicate what
> part or aspect of the entity was used.
>
> HTH
>
> --
> Daniel Kinzler
> Senior Software Developer
>
> Wikimedia Deutschland
> Gesellschaft zur Förderung Freien Wissens e.V.
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>



-- 
Kontokostas Dimitris
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Determining Wikidata Usage in Wikipedia Pages

2016-11-24 Thread Daniel Kinzler
On 23.11.2016 at 21:33, Andrew Hall wrote:
> Hi,
> 
> I’m a PhD student/researcher at the University of Minnesota who (along with
> Max Klein and another grad student/researcher) has been interested in
> understanding the extent to which Wikidata is used in (English, for now)
> Wikipedia.
>
> There seems to be no easy way to determine Wikidata usage in Wikipedia pages,
> so I’ll describe two approaches we’ve considered as our best attempts at
> solving this problem. I’ll also describe shortcomings of each approach.

There are two pretty easy ways, which you may not have found because they were
added only a couple of months ago:

You can look at the "page information" (action=info, linked from the sidebar),
e.g.
.
Near the bottom you can find "Wikidata entities used in this page".

The same information is available via an API module,
.
See

for documentation.


These URLs will list all direct and indirect usages, and also indicate what part
or aspect of the entity was used.
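
A minimal sketch of querying the API module mentioned above, assuming the
English Wikipedia endpoint and the default JSON layout (pages keyed by page
ID, each carrying a "wbentityusage" map of entity ID to a list of aspects).
The helper names, the sample entity IDs and the sample aspect codes are
invented for illustration; the sample dictionary is not real API output.

```python
from urllib.parse import urlencode

# Assumption: English Wikipedia; other wikis use their own api.php endpoint.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def entity_usage_url(title, endpoint=API_ENDPOINT):
    """Build the prop=wbentityusage query URL for one page title."""
    params = {
        "action": "query",
        "prop": "wbentityusage",
        "titles": title,
        "format": "json",
    }
    return endpoint + "?" + urlencode(params)


def usage_by_entity(response):
    """Collect {entity ID: set of aspect codes} from a decoded JSON response."""
    usage = {}
    for page in response.get("query", {}).get("pages", {}).values():
        for entity_id, info in page.get("wbentityusage", {}).items():
            usage.setdefault(entity_id, set()).update(info.get("aspects", []))
    return usage


# Hypothetical response fragment, shaped like the assumed JSON layout.
sample_response = {
    "query": {
        "pages": {
            "123": {
                "title": "South Pole Telescope",
                "wbentityusage": {
                    "Q1": {"aspects": ["S", "O"]},
                    "Q2": {"aspects": ["L.en"]},
                },
            }
        }
    }
}

print(entity_usage_url("South_Pole_Telescope"))
print(usage_by_entity(sample_response))
```

Fetching the URL with any HTTP client and passing the decoded JSON to
usage_by_entity() would give the per-entity aspects for the page; long
results may need the API's continuation mechanism, which is not handled here.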

HTH

-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



Re: [Wikidata] Determining Wikidata Usage in Wikipedia Pages

2016-11-24 Thread Léa Lacroix
Hello,

That's a very interesting topic, I'm looking forward to seeing the results
:)

You may already know that we're tracking entity usage on Wikipedia.
Example: https://en.wikipedia.org/wiki/Barack_Obama?action=info section
"Wikidata entities used in this page"
Documentation and several ways to access this information:
https://www.wikidata.org/wiki/Wikidata:Entity_Usage

If you have any questions, feel free to contact me.
Best,

On 24 November 2016 at 14:52, Nicolas VIGNERON 
wrote:

> Great idea.
>
>> The first approach involves analyzing Wikipedia templates to look for
>> explicit references (i.e. “#property:P”) across all templates.
>>
>
> This syntax is rarely if ever used on fr.wiki (it's now even forbidden in
> the main namespace), where almost all calls to Wikidata are done through
> modules (we had an RFC which *kind of* makes it mandatory).
>
> Meanwhile on fr.wiki, we extensively use categories (automatically added)
> for tracking Wikidata; the main category is
> https://fr.wikipedia.org/wiki/Cat%C3%A9gorie:Page_utilisant_Wikidata_par_propri%C3%A9t%C3%A9
> I'm curious to see a comparison showing whether a lot of Wikidata calls are
> untracked by categories.
>
> Regards, ~nicolas
>


-- 
Léa Lacroix
Community Communication Manager for Wikidata

Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as a charitable
organization by the Finanzamt für Körperschaften I Berlin, tax number
27/029/42207.


Re: [Wikidata] Determining Wikidata Usage in Wikipedia Pages

2016-11-24 Thread Nicolas VIGNERON
Great idea.

> The first approach involves analyzing Wikipedia templates to look for
> explicit references (i.e. “#property:P”) across all templates.
>

This syntax is rarely if ever used on fr.wiki (it's now even forbidden in the
main namespace), where almost all calls to Wikidata are done through modules
(we had an RFC which *kind of* makes it mandatory).

Meanwhile on fr.wiki, we extensively use categories (automatically added)
for tracking Wikidata; the main category is
https://fr.wikipedia.org/wiki/Cat%C3%A9gorie:Page_utilisant_Wikidata_par_propri%C3%A9t%C3%A9
I'm curious to see a comparison showing whether a lot of Wikidata calls are
untracked by categories.
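
The comparison could be sketched roughly as below, assuming one already has
two collections of page titles: pages known to use Wikidata (from whatever
usage tracking is available) and pages found in the tracking categories. The
function names are invented for illustration. The categorymembers parameters
are standard MediaWiki API parameters, but since the category above appears
to be a parent category, its subcategories would likely need to be walked as
well; that step is not shown.

```python
def categorymembers_params(category, limit=500):
    """Parameters for a MediaWiki list=categorymembers query (one batch)."""
    return {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmlimit": limit,
        "format": "json",
    }


def untracked_pages(pages_using_wikidata, pages_in_tracking_categories):
    """Titles that use Wikidata but appear in no tracking category."""
    return sorted(set(pages_using_wikidata) - set(pages_in_tracking_categories))


# Hypothetical titles, for illustration only.
using = ["Page A", "Page B", "Page C"]
tracked = ["Page A", "Page C"]
print(untracked_pages(using, tracked))
```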

Regards, ~nicolas


[Wikidata] Determining Wikidata Usage in Wikipedia Pages

2016-11-24 Thread Andrew Hall
Hi,

I’m a PhD student/researcher at the University of Minnesota who (along with Max 
Klein and another grad student/researcher) has been interested in understanding 
the extent to which Wikidata is used in (English, for now) Wikipedia. 

There seems to be no easy way to determine Wikidata usage in Wikipedia pages so 
I’ll describe two approaches we’ve considered as our best attempts at solving 
this problem. I’ll also describe shortcomings of each approach. 

The first approach involves analyzing Wikipedia templates to look for explicit
references (i.e. “#property:P”) across all templates. For a given
template containing a certain property reference, we then assume that the
statement corresponding to the Wikidata property is used in all Wikipedia pages
that transclude that template. However, there are two clear limitations to this
approach:

1) If we assume that the statement corresponding to the Wikidata property is
   used in all Wikipedia pages that transclude that template, this results in a
   sort of upper bound on the number of actual property usages in Wikipedia.
   However, we have no sense of what the actual usage looks like, since each
   template has its own logic, and whether or not a given property gets
   rendered in Wikipedia depends on that (sometimes quite complicated) logic.
   A possible way to get a sense of usage would be to sample a small set of
   random pages (that use templates using Wikidata) and manually check whether
   the Wikidata statement for the given Wikidata item is exactly the same as
   that rendered in the corresponding Wikipedia page. If it is, then we might
   assume the property is being used. Of course, this is not a perfect
   approach, since it's possible that a Wikidata statement is used in
   Wikipedia but formatted differently in Wikidata versus in Wikipedia (e.g. a
   date is rendered using a different format).

2) This approach does not account for Lua modules, which can be referenced
   from within templates. The modules can (and sometimes do) contain code that
   supplies Wikidata to the Wikipedia pages that transclude the templates
   containing the module references. Without understanding and accounting for
   the logic in all Lua modules that use Wikidata, it does not seem possible to
   actually know which Wikidata properties are being introduced to Wikipedia
   pages through this method.
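
The template scan in this first approach could look roughly like the sketch
below. It assumes references are written in the numeric-ID form of the
parser function, such as {{#property:P36}}; calls that name a property by its
label, and anything done inside Lua modules, are not detected. The function
name and the sample wikitext are invented for illustration.

```python
import re

# Matches "{{#property:P<digits>" with optional whitespace around the parts.
PROPERTY_CALL = re.compile(r"\{\{\s*#property\s*:\s*(P\d+)", re.IGNORECASE)


def referenced_properties(wikitext):
    """Return the set of property IDs referenced via #property in wikitext."""
    return {match.group(1).upper() for match in PROPERTY_CALL.finditer(wikitext)}


# Hypothetical template source, for illustration only.
sample_template = (
    "{{Infobox\n"
    "| capital = {{#property:P36}}\n"
    "| population = {{ #property:P1082 |from=Q64}}\n"
    "}}"
)
print(sorted(referenced_properties(sample_template)))
```

As noted above, the result is only an upper bound per template: a matched
reference may sit behind template logic that never renders it on a given page.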

The second approach involves expanding (using the MediaWiki API, see
https://www.mediawiki.org/wiki/API:Expandtemplates) already transcluded
templates into HTML tables in two ways: 1) in the context of the appropriate
Wikipedia page and 2) out of context of the appropriate Wikipedia page (e.g. in
my own sandbox). It’s my understanding that if the Wikipedia page uses
Wikidata, then that Wikidata should show up in the expansion if the template is
expanded in the context of its page, and not when expanded elsewhere (e.g. in
my sandbox). We would then check whether there is a difference between the
two expansions by HTML diff-ing. The difference between the two expanded
templates would presumably be due to Wikidata. Of course, there are limitations
to this approach as well:

1) It's possible that a Wikipedia contributor manually entered data (into a
   transcluded template) that exactly matches data in Wikidata, and thus the
   two expansions would be identical. Wikidata would not be recognizable in
   this case.

2) Once we identify (through diff-ing) where Wikidata is being used in expanded
   templates, it's not obvious which specific Wikidata properties/statements
   were used. In other words, "linking" Wikidata to corresponding HTML (table)
   rows in an expanded template seems challenging.
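
The diff step of this second approach could be sketched as follows, assuming
the two expansions have already been fetched as text. expand_params() only
assembles standard action=expandtemplates parameters (that module returns
expanded wikitext rather than HTML, so rendering to HTML would be a separate
step); the function names and sample strings are invented for illustration,
and the premise that the in-context expansion differs because of Wikidata is
the assumption stated above, not something the code verifies.

```python
import difflib


def expand_params(wikitext, title):
    """Parameters for action=expandtemplates, expanding text as if on `title`."""
    return {
        "action": "expandtemplates",
        "text": wikitext,
        "title": title,
        "prop": "wikitext",
        "format": "json",
    }


def lines_only_in_context(in_context, out_of_context):
    """Lines present in the in-context expansion but not the out-of-context one."""
    diff = difflib.ndiff(out_of_context.splitlines(), in_context.splitlines())
    # ndiff marks lines unique to the second sequence with a "+ " prefix.
    return [line[2:] for line in diff if line.startswith("+ ")]


# Hypothetical expansions, for illustration only.
expanded_on_page = "| name = Example\n| capital = Berlin"
expanded_in_sandbox = "| name = Example\n| capital = "
print(lines_only_in_context(expanded_on_page, expanded_in_sandbox))
```

A line-level diff like this only locates candidate rows; it does not say
which property produced them, which is the second limitation listed above.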

Any insight about how we can approach this problem would be greatly appreciated!

Thanks,
Andrew Hall


Re: [Wikidata] Importing translated names of important cities

2016-11-24 Thread Susanna Ånäs
Hi! Seems like a superb project! I see no reason why the import should be
carried out by a non-paid volunteer. Having it done by one person as
opposed to many will keep the data quality consistent.

I would be happy if this experience were logged / reported in WikiProject
Historical Place. This info is not necessarily historical in nature, but any
experiences guiding the place name practices would be highly valuable. If
necessary, a generic place name project could be initiated.

Cheers
Susanna Ånäs

2016-11-24 13:20 GMT+02:00 Arun Ganesh :

> Those interested in translations of world places, please give your
> feedback on this discussion [1]. There is a dataset of professionally
> translated world place names [2] that has already been through a round
> of independent review by native language speakers and is open to be used to
> improve Wikidata.
>
> 331 places were translated into 8 languages. Of the 2,648 translations
> that were received, 1,148 are new or differ from existing label
> translations, while 1,508 match perfectly with existing labels.
>
> From the discussion so far it seems like the next steps would be to get
> the translators to update the Wikidata entries from their own accounts with
> a paid contribution disclaimer as per [3]. The other option is to get
> community members in each language to once again do a review and update the
> corresponding labels. Any suggestions of which is the recommended way to go
> about this?
>
> [1] https://www.wikidata.org/wiki/Wikidata:Project_chat#Data_donation_of_translated_place_names
> [2] https://docs.google.com/spreadsheets/d/1SKVi9PZ_1ebxwWJAGshLQvkrJ5esOUwPHEkkj-LMr7w/edit#gid=1482753214
> [3] https://en.wikipedia.org/wiki/Wikipedia:Paid-contribution_disclosure
>


[Wikidata] Importing translated names of important cities

2016-11-24 Thread Arun Ganesh
Those interested in translations of world places, please give your feedback
on this discussion [1]. There is a dataset of professionally translated
world place names [2] that has already been through a round of
independent review by native language speakers and is open to be used to
improve Wikidata.

331 places were translated into 8 languages. Of the 2,648 translations that
were received, 1,148 are new or differ from existing label translations,
while 1,508 match perfectly with existing labels.

From the discussion so far it seems like the next steps would be to get the
translators to update the Wikidata entries from their own accounts with a
paid contribution disclaimer as per [3]. The other option is to get
community members in each language to once again do a review and update the
corresponding labels. Any suggestions of which is the recommended way to go
about this?

[1]
https://www.wikidata.org/wiki/Wikidata:Project_chat#Data_donation_of_translated_place_names
[2]
https://docs.google.com/spreadsheets/d/1SKVi9PZ_1ebxwWJAGshLQvkrJ5esOUwPHEkkj-LMr7w/edit#gid=1482753214
[3] https://en.wikipedia.org/wiki/Wikipedia:Paid-contribution_disclosure