[Wikidata-bugs] [Maniphest] [Commented On] T213597: [REQUEST] Baselines for structured data on Commons

2019-01-25 Thread Ramsey-WMF
Ramsey-WMF added a comment. Thanks guys! This should be all we need :) TASK DETAIL: https://phabricator.wikimedia.org/T213597 EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: MNeisler, Ramsey-WMF Cc: Neil_P._Quinn_WMF, chelsyx, MNeisler, mpopov, kzimmerman, Ramsey-W

[Wikidata-bugs] [Maniphest] [Commented On] T213597: [REQUEST] Baselines for structured data on Commons

2019-01-23 Thread mpopov
mpopov added a comment. @Abit @Ramsey-WMF in addition to T213597#4900741, here's the history of that metric with a 7-day rolling average to smooth the daily data a bit: F28004771: 2019-01_checkin.png
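For readers unfamiliar with the smoothing mentioned in this comment, here is a minimal sketch of a trailing 7-day rolling average. It is not the code behind the attached chart; the function name `rolling_mean` and the choice to emit `None` until a full window is available are assumptions made for illustration.

```python
from collections import deque


def rolling_mean(values, window=7):
    """Trailing rolling mean over `values` (hypothetical helper).

    Positions without a full window yet are reported as None,
    which is one of several reasonable conventions.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    buf = deque(maxlen=window)  # holds the most recent `window` values
    total = 0.0
    out = []
    for v in values:
        if len(buf) == window:
            total -= buf[0]  # this value is evicted by the append below
        buf.append(v)
        total += v
        out.append(total / window if len(buf) == window else None)
    return out


daily = [1, 2, 3, 4, 5, 6, 7, 8]
smoothed = rolling_mean(daily)
```

Each smoothed point averages the current day and the six days before it, so a single-day spike moves the line by only one seventh of its size.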

[Wikidata-bugs] [Maniphest] [Commented On] T213597: [REQUEST] Baselines for structured data on Commons

2019-01-23 Thread mpopov
mpopov added a comment. In T213597#4900903, @Neil_P._Quinn_WMF wrote: "True, but its revisions do have revision_is_deleted set, so you've already filtered them out of your query." Huh! Yeah, you're right! Haha, okay so I think what happened was I had checked the summarized_revisions table before I

[Wikidata-bugs] [Maniphest] [Commented On] T213597: [REQUEST] Baselines for structured data on Commons

2019-01-22 Thread Neil_P._Quinn_WMF
Neil_P._Quinn_WMF added a comment. In T213597#4899804, @mpopov wrote: Thank you so much, @Neil_P._Quinn_WMF! Really appreciate you catching that and correcting. I had incorrectly assumed that initial metadata would not be included. I'm currently looking into your suggested method of filtering revi

[Wikidata-bugs] [Maniphest] [Commented On] T213597: [REQUEST] Baselines for structured data on Commons

2019-01-22 Thread mpopov
mpopov added a comment. Okay, here are the numbers which were calculated with the following conditions: (1) using the December 2018 snapshot of MediaWiki History in the Data Lake; (2) only files which have not been deleted are counted; (3) only revisions to the metadata which were not reverted AND which were n

[Wikidata-bugs] [Maniphest] [Commented On] T213597: [REQUEST] Baselines for structured data on Commons

2019-01-22 Thread mpopov
mpopov added a comment. In T213597#4893765, @Neil_P._Quinn_WMF wrote: I noticed one big thing: it seems like your counts of file page edits (n_edits_total, n_additions_total, etc.) include the initial edit that creates the pages, so in the end you're getting the proportion of files which have met

[Wikidata-bugs] [Maniphest] [Commented On] T213597: [REQUEST] Baselines for structured data on Commons

2019-01-18 Thread Neil_P._Quinn_WMF
Neil_P._Quinn_WMF added a comment. In T213597#4893605, @mpopov wrote: "Here's the query I used, which I would like someone in #product-analytics (e.g. @chelsyx and @Neil_P._Quinn_WMF) to review:" Sure thing! I noticed one big thing: it seems like your counts of file page edits (n_edits_total, n_

[Wikidata-bugs] [Maniphest] [Commented On] T213597: [REQUEST] Baselines for structured data on Commons

2019-01-17 Thread Ramsey-WMF
Ramsey-WMF added a comment. "The statistic you want is: the % of all uploaded files which have had additions to their pages in the first 2 months after upload." Indeedy weedy. "No breakdown by file type or over time, just a count X and a total Y and the proportion X/Y, correct?" If it's not too much
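The statistic confirmed here (a count X, a total Y, and the proportion X/Y) can be sketched as below. This is not the query reviewed in the thread, which ran against MediaWiki History in the Data Lake. The tuple layout, the `baseline_proportion` name, the 60-day approximation of "2 months", and treating a positive byte difference as an "addition" are all assumptions for illustration. The sketch skips the page-creating edit, following the review point raised elsewhere in the thread.

```python
from datetime import datetime, timedelta

# Assumption: "2 months" is approximated as 60 days.
WINDOW = timedelta(days=60)


def baseline_proportion(uploads, revisions):
    """Return (X, Y, X/Y) for a hypothetical in-memory data set.

    uploads:   dict mapping file_id -> upload datetime
    revisions: iterable of (file_id, timestamp, bytes_diff, is_initial)
    """
    files_with_additions = set()
    for file_id, ts, bytes_diff, is_initial in revisions:
        uploaded = uploads.get(file_id)
        if uploaded is None or is_initial:
            continue  # unknown file, or the edit that created the page
        # Assumption: an "addition" is any later edit that adds bytes.
        if bytes_diff > 0 and uploaded < ts <= uploaded + WINDOW:
            files_with_additions.add(file_id)
    x = len(files_with_additions)
    y = len(uploads)
    return x, y, (x / y if y else 0.0)


t0 = datetime(2018, 1, 1)
uploads = {"a": t0, "b": t0, "c": t0, "d": t0}
revisions = [
    ("a", t0, 100, True),                        # page creation, ignored
    ("a", t0 + timedelta(days=10), 50, False),   # addition within window
    ("b", t0, 80, True),                         # page creation only
    ("c", t0 + timedelta(days=90), 20, False),   # addition, but too late
    ("d", t0 + timedelta(days=5), -5, False),    # removal, not an addition
]
x, y, proportion = baseline_proportion(uploads, revisions)
```

Only file "a" qualifies in this toy data, so the result is one file out of four.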

[Wikidata-bugs] [Maniphest] [Commented On] T213597: [REQUEST] Baselines for structured data on Commons

2019-01-17 Thread mpopov
mpopov added a comment. Thanks for clarifying! Okay, one more question for @Abit & @Ramsey-WMF just so everyone is on the same page. The statistic you want is: the % of all uploaded files which have had additions to their pages in the first 2 months after upload. No breakdown by file type or over

[Wikidata-bugs] [Maniphest] [Commented On] T213597: [REQUEST] Baselines for structured data on Commons

2019-01-16 Thread Ramsey-WMF
Ramsey-WMF added a comment. Responding to @mpopov as succinctly as possible: "Or are you referring to the entire page as the metadata? i.e. the whole shebang:" The whole page. "if someone is very thorough in their initial upload, does that file get included in the count? Or is it specifically revis

[Wikidata-bugs] [Maniphest] [Commented On] T213597: [REQUEST] Baselines for structured data on Commons

2019-01-16 Thread mpopov
mpopov added a comment. @Ramsey-WMF: hi, I would like to clarify what "metadata" includes. Here's my initial list: every field in the Information template; Licensing; Categories. Or are you referring to the entire page as the metadata? i.e. the whole shebang: F27911262: Screen Shot 2019-01-16 at

[Wikidata-bugs] [Maniphest] [Commented On] T213597: [REQUEST] Baselines for structured data on Commons

2019-01-15 Thread Ramsey-WMF
Ramsey-WMF added a comment. In T213597#4883187, @kzimmerman wrote: "@Abit @Ramsey-WMF When do you need to see the data from our team? Would end of next week work for you?" End of next week would be great :). Thanks!

[Wikidata-bugs] [Maniphest] [Commented On] T213597: [REQUEST] Baselines for structured data on Commons

2019-01-15 Thread kzimmerman
kzimmerman added a comment. @Abit @Ramsey-WMF When do you need to see the data from our team? Would end of next week work for you?