[Wikidata-bugs] [Maniphest] [Commented On] T213597: [REQUEST] Baselines for structured data on Commons

2019-01-22 Thread Neil_P._Quinn_WMF
Neil_P._Quinn_WMF added a comment.

In T213597#4899804, @mpopov wrote:
Thank you so much, @Neil_P._Quinn_WMF! Really appreciate you catching that and correcting. I had incorrectly assumed that initial metadata would not be included. I'm currently looking into your suggested method of filtering revisions and comparing it to using revision_parent_id > 0, which should theoretically yield the same result but does not in practice.


Glad to help! mediawiki_history is full of so many gotchas; I've fallen into plenty myself!


In T213597#4899804, @mpopov wrote:

In T213597#4893765, @Neil_P._Quinn_WMF wrote:
I don't understand the point of this, since the NOT revision_is_deleted should have already removed deleted files. (Also the page_id isn't necessarily null for deleted pages; after all the MediaWiki archive table has ar_page_id.)


https://commons.wikimedia.org/wiki/File:Box-Front.jpg is a deleted file with a null page_id and it gets included in summarized_revisions otherwise.


True, but its revisions do have revision_is_deleted set, so you've already filtered them out of your query.

TASK DETAIL
  https://phabricator.wikimedia.org/T213597

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Neil_P._Quinn_WMF
Cc: Neil_P._Quinn_WMF, chelsyx, MNeisler, mpopov, kzimmerman, Ramsey-WMF, Abit, JKSTNK, Lahi, PDrouin-WMF, E1presidente, Cparle, Anooprao, SandraF_WMF, Tramullas, Acer, Silverfish, Susannaanas, Jane023, Wikidata-bugs, Base, matthiasmullie, Ricordisamoa, Wesalius, Lydia_Pintscher, Fabrice_Florin, Raymond, Steinsplitter

___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T213597: [REQUEST] Baselines for structured data on Commons

2019-01-18 Thread Neil_P._Quinn_WMF
Neil_P._Quinn_WMF added a comment.

In T213597#4893605, @mpopov wrote:
Here's the query I used, which I would like someone in #product-analytics (e.g. @chelsyx and @Neil_P._Quinn_WMF) to review:


Sure thing!

I noticed one big thing: it seems like your counts of file page edits (n_edits_total, n_additions_total, etc.) include the initial edit that creates the page, so in the end you're getting the proportion of files which have metadata added in the first 2 months, including during the initial upload.

I tried excluding those initial creations (event_timestamp != page_creation_timestamp), and it looks like the proportion goes from 99% to 50%.
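
To make the effect concrete, here is a minimal sketch on made-up data, run in SQLite through Python rather than in Hive. The table name history and the three rows are hypothetical; only the column names and the event_timestamp != page_creation_timestamp condition come from this thread.

```python
import sqlite3

# Hypothetical toy rows, not real Commons data:
# (page_id, page_creation_timestamp, event_timestamp, revision_text_bytes_diff)
toy_rows = [
    (1, "2018-10-01 10:00:00", "2018-10-01 10:00:00", 500),  # the upload itself
    (1, "2018-10-01 10:00:00", "2018-10-03 09:00:00", 40),   # a later addition
    (2, "2018-10-01 11:00:00", "2018-10-01 11:00:00", 300),  # upload, never edited again
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history ("
    "page_id INTEGER, page_creation_timestamp TEXT, "
    "event_timestamp TEXT, revision_text_bytes_diff INTEGER)"
)
conn.executemany("INSERT INTO history VALUES (?, ?, ?, ?)", toy_rows)

# Per page, count additions with and without the page-creating revision,
# then count the pages that have at least one addition under each definition.
query = """
WITH per_page AS (
  SELECT
    page_id,
    SUM(CASE WHEN revision_text_bytes_diff > 0 THEN 1 ELSE 0 END) AS n_additions,
    SUM(CASE WHEN revision_text_bytes_diff > 0
              AND event_timestamp != page_creation_timestamp
             THEN 1 ELSE 0 END) AS n_later_additions
  FROM history
  GROUP BY page_id
)
SELECT
  COUNT(*),
  SUM(CASE WHEN n_additions > 0 THEN 1 ELSE 0 END),
  SUM(CASE WHEN n_later_additions > 0 THEN 1 ELSE 0 END)
FROM per_page
"""
n_files, n_with_creation, n_without_creation = conn.execute(query).fetchone()
print(n_files, n_with_creation, n_without_creation)
```

In this toy case every file counts as "added to" when the upload revision is included, because the upload always adds bytes, while only file 1 still counts once it is excluded.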

Query excluding initial creations

WITH summarized_revisions AS (
  SELECT
    page_id,
    TO_DATE(page_creation_timestamp) AS creation_date,
    COUNT(1) AS n_edits, -- not including reverts or reverted
    SUM(IF(event_timestamp != page_creation_timestamp, 1, 0)) AS n_later_edits, -- edits other than the page creation
    SUM(IF(revision_text_bytes_diff > 0 AND DATEDIFF(event_timestamp, page_creation_timestamp) <= 60 AND event_timestamp != page_creation_timestamp, 1, 0)) AS n_additions_2mo
  FROM wmf.mediawiki_history
  WHERE snapshot = '2018-12'
    AND wiki_db = 'commonswiki'
    AND page_creation_timestamp BETWEEN "2018-10-01" AND "2018-10-08"
    AND event_entity = 'revision'
    AND page_namespace = 6
    AND NOT revision_is_identity_revert -- don't count edits that are reverts
    AND NOT revision_is_identity_reverted -- don't count edits that were reverted
    AND NOT revision_is_deleted -- don't count edits moved to archive table
    AND page_id IS NOT NULL -- don't count deleted files
  GROUP BY page_id, TO_DATE(page_creation_timestamp)
)
SELECT
  creation_date,
  COUNT(1) AS n_uploaded, -- files uploaded
  SUM(IF(n_later_edits > 0, 1, 0)) AS n_later_edited, -- files whose pages were edited after upload
  SUM(IF(n_additions_2mo > 0, 1, 0)) AS n_added_to_2mo -- files that have had metadata added after creation and in first 2 months
FROM summarized_revisions
GROUP BY creation_date;


creation_date  n_uploaded  n_later_edited  n_added_to_2mo
2018-10-01          23390           13307           10248
2018-10-02          18226           11308            8947
2018-10-03          22763           16803           12142
2018-10-04          17455           12896            9088
2018-10-05          17321           11397           10261
2018-10-06          20191           12456           10558
2018-10-07          21479           11575            9853
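
As a quick check on the "50%" figure, the sketch below recomputes the share of uploads with a later addition from the daily counts in the table above. The split of each row into three numbers is my reading of the table, and the variable names are illustrative only.

```python
# Daily counts as read from the result table above:
# (creation_date, n_uploaded, n_later_edited, n_added_to_2mo)
daily_counts = [
    ("2018-10-01", 23390, 13307, 10248),
    ("2018-10-02", 18226, 11308, 8947),
    ("2018-10-03", 22763, 16803, 12142),
    ("2018-10-04", 17455, 12896, 9088),
    ("2018-10-05", 17321, 11397, 10261),
    ("2018-10-06", 20191, 12456, 10558),
    ("2018-10-07", 21479, 11575, 9853),
]

# Share of each day's uploads that had metadata added after creation.
for creation_date, n_uploaded, _n_later_edited, n_added in daily_counts:
    print(f"{creation_date}  {n_added / n_uploaded:.1%}")

# Pooled share across the whole week.
total_uploaded = sum(row[1] for row in daily_counts)
total_added = sum(row[3] for row in daily_counts)
overall_share = total_added / total_uploaded
print(f"overall     {overall_share:.1%}")
```

Pooled over the seven days this comes out at roughly half of all uploads, in line with the figure quoted above.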





Other comments

WITH summarized_revisions AS (
SELECT
  page_id, TO_DATE(page_creation_timestamp) AS creation_date,
  COUNT(1) AS n_edits_total, -- not including reverts or reverted

I think this includes uploads of new file versions, not just metadata edits, but I don't think it would change the results much.

  SUM(IF(revision_text_bytes_diff > 0, 1, 0)) AS n_additions_total,
  SUM(IF(DATEDIFF(event_timestamp, page_creation_timestamp) <= 60, 1, 0)) AS n_edits_2mo,
  SUM(IF(revision_text_bytes_diff > 0 AND DATEDIFF(event_timestamp, page_creation_timestamp) <= 60, 1, 0)) AS n_additions_2mo
FROM wmf.mediawiki_history
WHERE snapshot = '2018-12'
  AND wiki_db = 'commonswiki'
  AND event_entity = 'revision'
  AND page_namespace = 6
  AND NOT revision_is_identity_revert -- don't count edits that are reverts
  AND NOT revision_is_identity_reverted -- don't count edits that were reverted
  AND NOT revision_is_deleted -- don't count edits moved to archive table
  AND page_id IS NOT NULL -- don't count deleted files

I don't understand the point of this, since the NOT revision_is_deleted should have already removed deleted files. (Also the page_id isn't necessarily null for deleted pages; after all the MediaWiki archive table has ar_page_id.)
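
A toy illustration of that point, again in SQLite through Python. The table name history and the three rows are invented for the example; they show what the claim implies, and say nothing about what the real snapshot contains, which would need checking directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history (rev_id INTEGER, page_id INTEGER, revision_is_deleted INTEGER)"
)
# Hypothetical rows: one live revision and two archived (deleted) ones.
conn.executemany(
    "INSERT INTO history VALUES (?, ?, ?)",
    [
        (1, 100, 0),   # revision of a live file page
        (2, None, 1),  # deleted revision whose page_id is null
        (3, 200, 1),   # deleted revision that still carries a page_id
    ],
)

def rev_ids(where_clause):
    """Return the rev_ids that survive the given WHERE clause."""
    sql = f"SELECT rev_id FROM history WHERE {where_clause} ORDER BY rev_id"
    return [row[0] for row in conn.execute(sql)]

flag_only = rev_ids("NOT revision_is_deleted")
flag_and_null_check = rev_ids("NOT revision_is_deleted AND page_id IS NOT NULL")
null_check_only = rev_ids("page_id IS NOT NULL")
print(flag_only, flag_and_null_check, null_check_only)
```

With rows like these, adding page_id IS NOT NULL on top of NOT revision_is_deleted changes nothing, and the null check alone would let the deleted revision 3 through.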

 GROUP BY page_id, TO_DATE(page_creation_timestamp)
)
SELECT
 creation_date,
 COUNT(1) AS n_total, -- files uploaded
 SUM(IF(n_edits_total > 0, 1, 0)) AS n_edited, -- files that have had metadata edited
 SUM(IF(n_additions_total > 0, 1, 0)) AS n_added_to, -- files that have had metadata added
 SUM(IF(n_edits_2mo > 0, 1, 0)) AS n_edited_2mo, -- files that have had metadata edited in first 2 months
 SUM(IF(n_additions_2mo > 0, 1, 0)) AS n_added_to_2mo -- files that have had metadata added in first 2 months
 FROM summarized_revisions
GROUP BY creation_date;


[Wikidata-bugs] [Maniphest] [Lowered Priority] T177357: Metrics for SDoC: future work of interest (templates and licensing)

2018-09-27 Thread Neil_P._Quinn_WMF
Neil_P._Quinn_WMF lowered the priority of this task from "High" to "Normal".

TASK DETAIL
  https://phabricator.wikimedia.org/T177357


[Wikidata-bugs] [Maniphest] [Raised Priority] T177357: Metrics for SDoC: future work of interest (templates and licensing)

2018-09-27 Thread Neil_P._Quinn_WMF
Neil_P._Quinn_WMF raised the priority of this task from "Normal" to "High".


[Wikidata-bugs] [Maniphest] [Commented On] T177357: Metrics for SDoC: future work of interest (templates and licensing)

2018-05-03 Thread Neil_P._Quinn_WMF
Neil_P._Quinn_WMF added a comment.
For reference, I did some work counting the number of Commons files with different CC licenses:


[Wikidata-bugs] [Maniphest] [Changed Project Column] T69434: Make Hovercards work on Wikidata item links

2015-07-16 Thread Neil_P._Quinn_WMF
Neil_P._Quinn_WMF moved this task to To Do on the Hovercards workboard.

TASK DETAIL
  https://phabricator.wikimedia.org/T69434

WORKBOARD
  https://phabricator.wikimedia.org/project/board/765/



[Wikidata-bugs] [Maniphest] [Changed Project Column] T69434: Make Hovercards work on Wikidata item links

2015-07-16 Thread Neil_P._Quinn_WMF
Neil_P._Quinn_WMF moved this task to Nice to have on the Hovercards workboard.
