Re: [Gendergap] Sex Ratios in Wikidata Part III

2014-06-11 Thread Federico Leva (Nemo)

Lennart Guldbrandsson, 10/06/2014 00:34:

> On Swedish Wikipedia, the categorization was a conscious decision to try
> and find out where we stood.


On it.wiki such a hammer is not needed because, since 2006, all
biographical entries have been added with a template.

https://tools.wmflabs.org/personabot/?sesso=F&q=1
https://tools.wmflabs.org/personabot/?sesso=M&q=1
36975 female vs. 222635 male.

Nemo

___
Gendergap mailing list
Gendergap@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/gendergap


Re: [Gendergap] Sex Ratios in Wikidata Part III

2014-06-10 Thread Andrew Gray
On 9 June 2014 23:34, Lennart Guldbrandsson wrote:
> Some language versions of Wikipedia do have gender categorization, such as
> Swedish and German Wikipedia. (The English categories exist but are not used
> very much.) Here's a link to the Swedish ones:
>
> https://sv.wikipedia.org/wiki/Kategori:M%C3%A4n (men)
> presently 132 211 articles
>
> https://sv.wikipedia.org/wiki/Kategori:Kvinnor (women)
> presently 32 693 articles
>
> This gives a rough proportion of 1 female for every 4 male article subjects.
> If my memory serves me, the German Wikipedia numbers are a bit higher
> (perhaps 1 in 6).
>
> On Swedish Wikipedia, the categorization was a conscious decision to try and
> find out where we stood.

Thanks - I knew about the German categories but not the Swedish ones.

Interestingly, Wikidata reports:

32661 female on svwiki:
http://tools.wmflabs.org/wikidata-todo/autolist.html?q=claim%5B31%3A5%5D%20and%20claim%5B21%3A6581072%5D%20and%20link%5Bsvwiki%5D

130801 male on svwiki:
http://tools.wmflabs.org/wikidata-todo/autolist.html?q=claim%5B31%3A5%5D%20and%20claim%5B21%3A6581097%5D%20and%20link%5Bsvwiki%5D
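The percent-encoded query in the two links above is easier to read once decoded. A minimal sketch using only the Python standard library; the helper name decode_autolist_query is made up for illustration, and the comments rely on the usual Wikidata identifiers (P31 "instance of", Q5 "human", P21 "sex or gender", Q6581072 "female", Q6581097 "male").

```python
from urllib.parse import urlsplit, parse_qs

# The two autolist links quoted in the message, split only for line length.
BASE = "http://tools.wmflabs.org/wikidata-todo/autolist.html"
FEMALE_URL = BASE + "?q=claim%5B31%3A5%5D%20and%20claim%5B21%3A6581072%5D%20and%20link%5Bsvwiki%5D"
MALE_URL = BASE + "?q=claim%5B31%3A5%5D%20and%20claim%5B21%3A6581097%5D%20and%20link%5Bsvwiki%5D"


def decode_autolist_query(url):
    """Return the human-readable `q` parameter of an autolist URL (hypothetical helper)."""
    query_string = urlsplit(url).query
    return parse_qs(query_string)["q"][0]


# claim[31:5]        -> instance of (P31) human (Q5)
# claim[21:6581072]  -> sex or gender (P21) female (Q6581072)
# link[svwiki]       -> the item has an article on Swedish Wikipedia
print(decode_autolist_query(FEMALE_URL))
print(decode_autolist_query(MALE_URL))
```

The two queries differ only in the P21 value, so the counts are directly comparable.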

Wikidata gives 20.0% female and the Wikipedia categories give 19.8%, so
they're in reasonably good alignment - almost perfectly matching for
women (32 apart), with about 1400 men not in Wikidata. I'll have a look at
getting these mapped across tonight :-)
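As a quick check, the comparison can be recomputed from the four counts quoted in this thread. This is only a sketch: the variable names are illustrative, and it assumes the two svwiki categories contain nothing but biographies.

```python
# Counts quoted in the thread (9-10 June 2014).
category = {"female": 32693, "male": 132211}  # svwiki categories Kvinnor / Män
wikidata = {"female": 32661, "male": 130801}  # Wikidata items linked to svwiki


def female_share(counts):
    """Fraction of female subjects among those marked female or male."""
    return counts["female"] / (counts["female"] + counts["male"])


for name, counts in (("categories", category), ("wikidata", wikidata)):
    print(f"{name}: {female_share(counts):.1%} female")

# Articles categorized on svwiki but without a matching gender claim on Wikidata.
gap = {sex: category[sex] - wikidata[sex] for sex in category}
print(gap)  # {'female': 32, 'male': 1410}
```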

-- 
- Andrew Gray
  andrew.g...@dunelm.org.uk



Re: [Gendergap] Sex Ratios in Wikidata Part III

2014-06-09 Thread Lennart Guldbrandsson
Some language versions of Wikipedia do have gender categorization, such as 
Swedish and German Wikipedia. (The English categories exist but are not used 
very much.) Here's a link to the Swedish ones:

https://sv.wikipedia.org/wiki/Kategori:M%C3%A4n (men)
presently 132 211 articles

https://sv.wikipedia.org/wiki/Kategori:Kvinnor (women)
presently 32 693 articles

This gives a rough proportion of 1 female for every 4 male article subjects. If
my memory serves me, the German Wikipedia numbers are a bit higher (perhaps 1
in 6).

On Swedish Wikipedia, the categorization was a conscious decision to try and
find out where we stood.


Best wishes,

Lennart Guldbrandsson

070 - 207 80 05
http://www.elementx.se - work
http://www.mrchapel.wordpress.com - personal blog


Presentation
@aliasHannibal - on Twitter

"Imagine a world in which every person on this planet has free access to
the world's collected knowledge. That is our goal."


Jimmy Wales



Re: [Gendergap] Sex Ratios in Wikidata Part III

2014-06-09 Thread Andrew Gray
On 9 June 2014 20:21, Nathan wrote:

>> * Wikidata has ~2080k items marked as people
>> * Of these, ~1893k have a "gender" property (91%)

> Can you define "item" in this context?

"Item" here is a single Wikidata entry:

http://www.wikidata.org/wiki/Q320

which may correspond to one Wikipedia article, one hundred Wikipedia
articles, etc - but all on the same topic. (Potentially it may
correspond to *no* Wikipedia articles - a linked article is not strictly
required, and in any case the source article may be deleted - but there's
unlikely to be a statistically significant number of these just now.)
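To make the item/article distinction concrete, here is a toy model, not Wikidata's actual data format: each item carries a mapping from wiki code to article title. The class name, the IDs and the titles are all made-up placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """One Wikidata entry; `sitelinks` maps a wiki code to an article title."""
    qid: str
    sitelinks: dict = field(default_factory=dict)


# Made-up placeholder items, not real Wikidata IDs or articles.
items = [
    Item("Q-example-1", {"enwiki": "Person A", "svwiki": "Person A", "dewiki": "Person A"}),
    Item("Q-example-2", {"svwiki": "Person B"}),
    Item("Q-example-3"),  # an item with no Wikipedia article at all
]


def items_on(wiki, all_items):
    """Items that have an article on the given wiki."""
    return [item for item in all_items if wiki in item.sitelinks]


total_items = len(items)                                     # counts each person once
total_articles = sum(len(item.sitelinks) for item in items)  # counts every language version
svwiki_items = len(items_on("svwiki", items))
print(total_items, total_articles, svwiki_items)
```

Counting items gives one data point per person regardless of how many language versions cover them, which is why the overall figure and the per-wiki figures can differ.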

> Do we have any comparable data points by which to evaluate our progress?
> Perhaps a similar breakdown of other reference works, or if there is some
> sort of summary data available about biographies written (using LOC data?),
> etc.

The new Oxford Dictionary of National Biography was about 10% female
when published in 2004, though this was skewed by a requirement to
include all entries from the original, including a lot of - to modern
eyes - very non-notable men.
http://oed.hertford.ox.ac.uk/main/images/stories/articles/baigent2005.pdf
(It's since crept up to ~11%)

Max has done some numbers based on gender assigned in VIAF entries, I
think, but I can't immediately find them. Ben Schmidt did something
similar based on first names of authors:
http://sappingattention.blogspot.co.uk/2012/05/women-in-libraries.html

-- 
- Andrew Gray
  andrew.g...@dunelm.org.uk



Re: [Gendergap] Sex Ratios in Wikidata Part III

2014-06-09 Thread Nathan
On Mon, Jun 9, 2014 at 3:17 PM, Andrew Gray wrote:

> Hi all,
>
> I ran a few quick updates on Max's numbers today. As of 9/6/14:
>
> * Wikidata has ~2080k items marked as people
> * Of these, ~1893k have a "gender" property (91%)
>
> (Magnus's games are doing an amazing job at filling out these numbers,
> by the way - http://magnusmanske.de/wordpress/?p=213 )
>
> Very quick and dirty statistics follow - note that since we have 9%
> undefined, the stats may change a bit as time goes on :-)
>
> * The gender breakdown across all these people is approximately 1603k
> male, 290k female - 84.7% male and 15.3% female.
>
> * enwiki is 15.5% female; arwiki 14.2%; dewiki 14.9% female; frwiki
> 15.2%; eswiki 15.9%; jawiki 18.2%; hiwiki 18.7%; zhwiki 20.1%
>
> * It's interesting to note that these numbers mostly seem a point or
> two better than the numbers Max got a month ago, which probably
> represents better data-logging rather than change in the underlying
> content
>
> * There are still very few items with a gender property other than
> "male" or "female" - perhaps 100-200 overall - but I suspect this
> number will significantly increase as we deal with the remaining
> items.
>
> Andrew.


Can you define "item" in this context?

Do we have any comparable data points by which to evaluate our progress?
Perhaps a similar breakdown of other reference works, or if there is some
sort of summary data available about biographies written (using LOC data?),
etc.


Re: [Gendergap] Sex Ratios in Wikidata Part III

2014-06-09 Thread Andrew Gray
Hi all,

I ran a few quick updates on Max's numbers today. As of 9/6/14:

* Wikidata has ~2080k items marked as people
* Of these, ~1893k have a "gender" property (91%)

(Magnus's games are doing an amazing job at filling out these numbers,
by the way - http://magnusmanske.de/wordpress/?p=213 )

Very quick and dirty statistics follow - note that since we have 9%
undefined, the stats may change a bit as time goes on :-)

* The gender breakdown across all these people is approximately 1603k
male, 290k female - 84.7% male and 15.3% female.

* enwiki is 15.5% female; arwiki 14.2%; dewiki 14.9% female; frwiki
15.2%; eswiki 15.9%; jawiki 18.2%; hiwiki 18.7%; zhwiki 20.1%

* It's interesting to note that these numbers mostly seem a point or
two better than the numbers Max got a month ago, which probably
represents better data-logging rather than change in the underlying
content

* There are still very few items with a gender property other than
"male" or "female" - perhaps 100-200 overall - but I suspect this
number will significantly increase as we deal with the remaining
items.
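The headline percentages can be reproduced from the rounded counts above, and the same arithmetic shows how far the 9% of items without a gender property could move the result. A rough sketch: the all-female and all-male cases are extreme bounds rather than predictions, and the handful of items with other gender values are ignored.

```python
# Rounded counts from the message, in thousands of items.
people = 2080        # items marked as people
with_gender = 1893   # of those, items with a "gender" property
male, female = 1603, 290

coverage = with_gender / people
male_share = male / (male + female)
female_share = female / (male + female)
print(f"coverage {coverage:.0%}, male {male_share:.1%}, female {female_share:.1%}")

# Bounds on the final female share once every item has a gender value.
undefined = people - with_gender             # 187k items still undefined
lowest = female / people                     # if all undefined items turn out male
highest = (female + undefined) / people      # if all undefined items turn out female
print(f"female share between {lowest:.1%} and {highest:.1%}")
```

The true figure will sit well inside that range unless the unlabelled items are very different from the labelled ones.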

Andrew.


-- 
- Andrew Gray
  andrew.g...@dunelm.org.uk



[Gendergap] Sex Ratios in Wikidata Part III

2014-05-22 Thread Maximilian Klein
Hi Everyone,

I just conducted some new research I thought you might be intrigued by.


   - It compares the "sex or gender" labels in use by Wikidata today - 13
   in total.
   - The percentage of articles about "female"s by language.
      - The best are Serbian Wikipedia, or Urdu Wikipedia, depending on the
      size you count.
   - The wiki that has become most sexist in 2014 - English Wikipedia.
   - And the data richness per sex value - 6.2 Wikidata statements per
   male, 6.0 per female.


See the full blog post here, and please send me questions and suggestions -
http://notconfusing.com/sex-ratios-in-wikidata-part-iii/

Max Klein
‽ http://notconfusing.com/