user_editcount includes edits to deleted pages and revdeleted edits.
Erik's Perl scripts use the XML dumps, which do not include edits to deleted
pages.

Strictly speaking, user_editcount is a better proxy for the number of
people who have "ever edited".  Erik's figure is the number of people whose
edits appear in the history of a page at the time of an XML dump.
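
To make the distinction concrete, here is a minimal sketch in Python on
made-up data. The edit log, the user names, and both helper functions are
hypothetical illustrations, not Erik's scripts or the actual MediaWiki
schema, and the sketch only models page deletion.

```python
from collections import Counter

# Hypothetical toy edit log: (user, page_title, page_deleted_since).
EDITS = [
    ("Alice", "Foo", False),
    ("Bob", "Spam page", True),    # Bob's only edit is on a deleted page
    ("Carol", "Bar", False),
    ("Carol", "Old draft", True),  # Carol also edited a deleted page
    ("Dave", "Spam page", True),   # Dave's only edit is on a deleted page
]


def editors_by_counter(edits):
    """Accounts with a nonzero running edit counter.

    Assumption: the counter is incremented on every saved edit and is not
    decremented when the page is later deleted.
    """
    counter = Counter(user for user, _page, _deleted in edits)
    return {user for user, n in counter.items() if n > 0}


def editors_in_dump(edits):
    """Accounts with at least one edit on a page that still exists."""
    return {user for user, _page, deleted in edits if not deleted}


if __name__ == "__main__":
    by_counter = editors_by_counter(EDITS)
    in_dump = editors_in_dump(EDITS)
    print(len(by_counter), len(in_dump), sorted(by_counter - in_dump))
```

On this toy log the counter-based set has four accounts and the dump-based
set has two; the gap is the accounts whose only edits were to pages that
have since been deleted.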

-Aaron

On Tue, Oct 27, 2015 at 9:34 AM, Jonathan Morgan <[email protected]>
wrote:

> I also wonder about this discrepancy. I ran a more explicit version of
> Andrew's query, trying to eliminate some possible edge cases, and came up
> with the same number.
>
> Now I'm curious. Are there junk rows in our user table, retained for
> legacy reasons maybe? Is user_editcount inaccurate? Erik, can you describe
> the processing you perform to winnow down from 8.2 million?
>
> J
>
> On Tue, Oct 27, 2015 at 7:06 AM, Andrew Gray <[email protected]>
> wrote:
>
>> Interesting - wonder why my query's giving a higher number?
>>
>> I agree entirely that we should be very careful with quoting these
>> figures. I think you'd probably be safe to say that more than a
>> million people have edited... but even then I'd be cautious.
>>
>> Andrew.
>>
>> On 27 October 2015 at 11:11, Erik Zachte <[email protected]> wrote:
>> > Wikistats has it that 5,644,681 registered accounts published at least
>> > once till Oct 1, 2015, and 2,181,006 three or more times.
>> > It used to publish that on [1][2] but I just removed it.
>> >
>> > I've been campaigning against us publishing overly inflated counts for
>> > about two years (since Wikimania London).
>> >
>> > Since this thread is going on and on, I'll repost my (reworded)
>> > reservations on this particular metric, for newcomers.
>> >
>> > Even if we state explicitly that this is not unique people, any
>> > audience will think it may be close and that we are being overly
>> > cautious by adding the caveat. It may not be so close. For that reason,
>> > IMO, such a metric would be of questionable value, to put it mildly.
>> >
>> > Pine:
>> >> Is there a way to get counts for the number of accounts, including or
>> >> excluding IPs, that have ever edited English Wikipedia?
>> >
>> > First the anon contributors: if we counted every IP address that
>> > shows up in the dumps, we'd count *very* many people who were just
>> > vandalizing willfully, or just pressing edit for fun, or forgot to log
>> > in once, and also people who moved from one IP address to another over
>> > the years. On top of that, many people get a new IP address (from a
>> > pool) on every session, depending on provider policy.
>> >
>> > As for registered editors, the number Wikistats used to publish may be
>> > a rather empty metric for several reasons:
>> > - How many casual editors will have forgotten their password and just
>> >   created a new user id? Only veteran editors know about sockpuppeting
>> >   and how one is not supposed to do that.
>> > - How many people will have registered in good faith just out of habit,
>> >   or to tweak presentation preferences, and then played with the edit
>> >   button just to see what happens? Note that roughly 2 out of 3 accounts
>> >   don't even reach 3 edits.
>> >
>> > Cheers,
>> > Erik Zachte
>> >
>> > [1] https://stats.wikimedia.org/EN/TablesWikipediaEN.htm#editdistribution
>> > [2] BTW I use the term "wikipedians" overly inclusively in that report.
>> > A person who edited once or twice isn't a wikipedian in my book, just
>> > like a person who writes two post-it notes per month and nothing else
>> > isn't called a writer. Some terms only apply above some threshold.
>> >
>> > -----Original Message-----
>> > From: Analytics [mailto:[email protected]] On Behalf Of Andrew Gray
>> > Sent: Tuesday, October 27, 2015 11:06
>> > To: A mailing list for the Analytics Team at WMF and everybody who has
>> >  an interest in Wikipedia and analytics.
>> > Subject: Re: [Analytics] User statistics for video marking ENWP 5m
>> >  article milestone
>> >
>> > To a very crude approximation, there are about 8.2 million accounts
>> > which have at least one edit on English Wikipedia - at least assuming
>> > my SQL query is correct! http://quarry.wmflabs.org/query/1911
>> >
>> > This is all user accounts with one or more edits in the contributions
>> > record; it does not contain IPs, and it does not contain any accounts
>> > whose sole contributions have since been deleted (which is probably
>> > quite a substantial number). Conversely, it includes a vast panoply of
>> > single-use vandalism accounts, sockpuppets, etc etc etc. And bots, of
>> > course.
>> >
>> > Andrew.
>> >
>> > On 27 October 2015 at 05:50, Pine W <[email protected]> wrote:
>> >> Is there a way to get counts for the number of accounts, including or
>> >> excluding IPs, that have ever edited English Wikipedia? It would be
>> >> preferable to know the number of unique people, but of course that's
>> >> impossible.
>> >>
>> >> Thanks,
>> >> Pine
>> >>
>> >> Aha, that is important for me to know. Thanks Andrew.
>> >>
>> >> Pine
>> >>
>> >>
>> >> On Thu, Sep 17, 2015 at 11:07 AM, Andrew Gray
>> >> <[email protected]>
>> >> wrote:
>> >>>
>> >>> On 11 September 2015 at 19:19, James Forrester
>> >>> <[email protected]>
>> >>> wrote:
>> >>>
>> >>> >> Does it include editors on all Wikimedia projects
>> >>> >
>> >>> > No.
>> >>> >
>> >>> >> or just those who have registered and/or edited on ENWP?
>> >>> >
>> >>> > Registered, regardless of having edited.
>> >>>
>> >>> James is of course correct, but one small caveat worth adding:
>> >>> because of SUL, a substantial proportion of these will be
>> >>> "autocreated" accounts from other projects - so even 'registration'
>> >>> may not mean what it seems.
>> >>>
>> >>> --
>> >>> - Andrew Gray
>> >>>   [email protected]
>> >>>
>> >>> _______________________________________________
>> >>> Analytics mailing list
>> >>> [email protected]
>> >>> https://lists.wikimedia.org/mailman/listinfo/analytics
>> >>
>> >
>> >
>> >
>> > --
>> > - Andrew Gray
>> >   [email protected]
>> >
>>
>>
>>
>> --
>> - Andrew Gray
>>   [email protected]
>>
>
>
>
> --
> Jonathan T. Morgan
> Senior Design Researcher
> Wikimedia Foundation
> User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>
>