Dear Jennifer and the Product Analytics Team,

It is really nice that you prepared and updated this dataset! Thank you
very much.

My first feedback:
1) I don't understand "overall size rank". Based on the definitions, it is
the "count of unique devices which visited the wiki during that month".
What does this have to do with the size of the wiki?
2) I am missing the *size of the database* (main namespace/content pages),
which together with the number of content pages would give an impression of
the mean size of the articles (I know this is not perfect, but it is better
than nothing).
3) Many more wishes about additional metrics :D

Best regards,
Samat


On Wed, 24 Feb 2021 at 10:55, Goran Milovanovic <
[email protected]> wrote:

> Hi Kate,
>
> and thank you very much for your feedback.
>
> I think I forgot to mention that the Wiki comparison 2020 dataset is so
> great that I will start using it in my R programming classes as of today
> to help people learn more about hypothesis testing and join operations
> across dataframes : )
>
> Thank you for all the hard work!
>
> > We'll keep an eye toward consistency, but we have not made the data
> > extraction into a fully automated process.
> I have seen the code, and I know the pain too well. After all the work,
> there is always some additional detail in the end that was perhaps not
> considered at the beginning. Similar things have made me cry in the past
> in my work on Wikidata. I sympathise with you and the team, and I wish you
> all the best in your future work!
>
> And by the way, the differences in column names are really not such a
> big deal: the variable semantics are so obvious that they match easily.
> Good work!
>
> With best wishes,
> Goran
>
> Goran S. Milovanović, PhD
> Data Scientist, Software Department
> Wikimedia Deutschland
>
> ------------------------------------------------
> "It's not the size of the dog in the fight,
> it's the size of the fight in the dog."
> - Mark Twain
> ------------------------------------------------
>
>
> On Tue, Feb 23, 2021 at 11:38 PM Kate Zimmerman <[email protected]>
> wrote:
>
>> Hi Goran,
>>
>> We'll keep an eye toward consistency, but we have not made the data
>> extraction into a fully automated process.
>>
>> We identified 3 columns that had slightly different names and we'll fix
>> them:
>> overall SIZE rank (2020) vs. overall size rank (2018, 2019)
>> second month editor retention (2020) vs. second-month new editor
>> retention (2018, 2019)
>> monthly structured discussions messages (2020) vs. monthly structured
>> discussions (Flow) messages (2018, 2019)
>>
>> The "project code" column was duplicated in 2020; the duplicate has now
>> been removed.
>>
>> Finally, in 2019 we added 3 new columns that we hadn't tracked in
>> 2018: content pages, cumulative content edits, edits per content page.
>> Please be aware that we may add or change columns in the future as needs
>> evolve.
>>
>> Warm regards,
>> Kate
>>
>> On Tue, Feb 23, 2021 at 12:37 PM Goran Milovanovic <
>> [email protected]> wrote:
>>
>>> Well, it would be desirable to maintain consistent column names across
>>> the years...
>>>
>>> Best,
>>> Goran
>>>
>>>
>>> On Tue, Feb 23, 2021 at 2:42 AM Jennifer Wang <[email protected]>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> For your reference, we have updated the wiki comparison dataset
>>>> <https://www.mediawiki.org/wiki/Product_Analytics/Comparison_datasets>
>>>> with 2020 data
>>>> <https://docs.google.com/spreadsheets/d/1a-UBqsYtJl6gpauJyanx0nyxuPqRvhzJRN817XpkuS8/edit?usp=sharing>.
>>>> If you have any feedback or suggestions, please let us know via
>>>> [email protected].
>>>>
>>>> Regards,
>>>> Jennifer & Product Analytics
>>>> _______________________________________________
>>>> Analytics mailing list
>>>> [email protected]
>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>>
