[Wikidata-bugs] [Maniphest] [Unblock] T193691: As a user of the Wikipedia app, I would like to be able to add or edit title descriptions from the app (eg. Wikidata descriptions)
chelsyx closed subtask T203723: As a product analyst I would like to know how people are using the Wikidata Descriptions editing features as Resolved. TASK DETAIL https://phabricator.wikimedia.org/T193691 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Mhurd, chelsyx Cc: ABorbaWMF, Sjoerddebruin, JMinor, PDrouin-WMF, Aklapper, Mhurd, cmadeo, Ddurigon, alaa_wmde, Mateo1977, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, Taquo, LawExplorer, catalandres, _jensen, rosalieper, Karthik_sripal, Wikidata-bugs, aude, Mbch331 ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T182849: Identify unhelpful file names on commons
chelsyx added a comment. A first try using logistic regression: https://paws-public.wmflabs.org/paws-public/User:CXie_(WMF)/commons_file_names.ipynb TASK DETAIL https://phabricator.wikimedia.org/T182849
[Wikidata-bugs] [Maniphest] [Updated] T204415: Query stats dashboard not updating
chelsyx added a project: Analytics. TASK DETAIL https://phabricator.wikimedia.org/T204415
[Wikidata-bugs] [Maniphest] [Changed Subscribers] T204415: Query stats dashboard not updating
chelsyx added a subscriber: Nuria. chelsyx added a comment.
Hi @Nuria, we noticed that since August 10th the SPARQL usage number has been very small (see the query in T204415#4590108), much lower than what we see in logstash: https://logstash.wikimedia.org/goto/74e376f55fcdc3b93e4a7232cfa5203a
Do you know of any webrequest incident that might have caused this?
TASK DETAIL https://phabricator.wikimedia.org/T204415
[Wikidata-bugs] [Maniphest] [Commented On] T204415: Query stats dashboard not updating
chelsyx added a comment.
Hi @Smalyshev, the dashboard is updating, but since August 10th the SPARQL usage number has been very small (even 0 on certain days) and the LDF usage number has been 0. Did we change the URI of the endpoint?
Query (SQL):
    SELECT
      year, month, day,
      IF(uri_path = '/sparql', '/bigdata/namespace/wdq/sparql', uri_path) AS path,
      UPPER(http_status IN('200','304')) AS http_success,
      CASE
        WHEN (
          agent_type = 'user' AND (
            user_agent RLIKE 'https?://'
            OR INSTR(user_agent, 'www.') > 0
            OR INSTR(user_agent, 'github') > 0
            OR LOWER(user_agent) RLIKE '([a-z0-9._%-]+@[a-z0-9.-]+\.(com|us|net|org|edu|gov|io|ly|co|uk))'
            OR (
              user_agent_map['browser_family'] = 'Other'
              AND user_agent_map['device_family'] = 'Other'
              AND user_agent_map['os_family'] = 'Other'
            )
          )
        ) OR agent_type = 'spider' THEN 'TRUE'
        ELSE 'FALSE'
      END AS is_automata,
      COUNT(*) AS events
    FROM webrequest
    WHERE webrequest_source = 'misc'
      AND year = 2018 AND month = 8 AND day > 9
      AND uri_host = 'query.wikidata.org'
      AND uri_path IN('/', '/bigdata/namespace/wdq/sparql', '/bigdata/ldf', '/sparql')
    GROUP BY
      year, month, day,
      IF(uri_path = '/sparql', '/bigdata/namespace/wdq/sparql', uri_path),
      UPPER(http_status IN('200','304')),
      CASE
        WHEN (
          agent_type = 'user' AND (
            user_agent RLIKE 'https?://'
            OR INSTR(user_agent, 'www.') > 0
            OR INSTR(user_agent, 'github') > 0
            OR LOWER(user_agent) RLIKE '([a-z0-9._%-]+@[a-z0-9.-]+\.(com|us|net|org|edu|gov|io|ly|co|uk))'
            OR (
              user_agent_map['browser_family'] = 'Other'
              AND user_agent_map['device_family'] = 'Other'
              AND user_agent_map['os_family'] = 'Other'
            )
          )
        ) OR agent_type = 'spider' THEN 'TRUE'
        ELSE 'FALSE'
      END
    ORDER BY year, month, day
    LIMIT 1
TASK DETAIL https://phabricator.wikimedia.org/T204415
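For readers who want to test individual user agents against the is_automata rule in the query above, the CASE expression can be mirrored in a few lines of Python. This is only a sketch under assumptions: the function name and argument shapes are hypothetical, Python's re.search and the in operator stand in for Hive's RLIKE and INSTR, and NULL handling in Hive is not reproduced.

```python
import re

# Patterns copied from the query; the surrounding Python is illustrative only.
URL_RE = re.compile(r"https?://")
EMAIL_RE = re.compile(
    r"([a-z0-9._%-]+@[a-z0-9.-]+\.(com|us|net|org|edu|gov|io|ly|co|uk))"
)
UA_FAMILY_KEYS = ("browser_family", "device_family", "os_family")


def is_automata(agent_type, user_agent, user_agent_map):
    """Return True when a request would be counted as automated.

    Hypothetical helper: mirrors the CASE expression, not the dashboard code.
    """
    if agent_type == "spider":
        return True
    if agent_type != "user":
        return False
    # All three parsed user-agent families are the catch-all value 'Other'.
    all_other = all(user_agent_map.get(k) == "Other" for k in UA_FAMILY_KEYS)
    return bool(
        URL_RE.search(user_agent)
        or "www." in user_agent
        or "github" in user_agent
        or EMAIL_RE.search(user_agent.lower())
        or all_other
    )
```

A browser-like agent with parsed families is treated as a person, while an agent that advertises a URL, an email address, or nothing recognizable is treated as automated.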
[Wikidata-bugs] [Maniphest] [Claimed] T182849: Identify unhelpful file names on commons
chelsyx claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T182849
[Wikidata-bugs] [Maniphest] [Closed] T177534: Search Metrics for SDoC: eventlogging
chelsyx closed this task as "Resolved". TASK DETAIL https://phabricator.wikimedia.org/T177534
[Wikidata-bugs] [Maniphest] [Unblock] T174519: [epic] SDoC: Determine baseline for metrics
chelsyx closed subtask T177534: Search Metrics for SDoC: eventlogging as "Resolved". TASK DETAIL https://phabricator.wikimedia.org/T174519
[Wikidata-bugs] [Maniphest] [Closed] T179450: Documentation of SDoC findings
chelsyx closed this task as "Resolved". TASK DETAIL https://phabricator.wikimedia.org/T179450
[Wikidata-bugs] [Maniphest] [Closed] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting
chelsyx closed this task as "Resolved". TASK DETAIL https://phabricator.wikimedia.org/T177353
[Wikidata-bugs] [Maniphest] [Unblock] T174519: [epic] SDoC: Determine baseline for metrics
chelsyx closed subtask T179450: Documentation of SDoC findings as "Resolved". TASK DETAIL https://phabricator.wikimedia.org/T174519
[Wikidata-bugs] [Maniphest] [Unblock] T174519: [epic] SDoC: Determine baseline for metrics
chelsyx closed subtask T177353: Metrics for SDoC: look at search hits based on which element the search is hitting as "Resolved". TASK DETAIL https://phabricator.wikimedia.org/T174519
[Wikidata-bugs] [Maniphest] [Closed] T174519: [epic] SDoC: Determine baseline for metrics
chelsyx closed this task as "Resolved". chelsyx claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T174519
[Wikidata-bugs] [Maniphest] [Changed Project Column] T174519: [epic] SDoC: Determine baseline for metrics
chelsyx moved this task from Needs review to Done on the Discovery-Analysis (Current work) board. chelsyx added a comment. Thank you @Ramsey-WMF ! :D TASK DETAIL https://phabricator.wikimedia.org/T174519 WORKBOARD https://phabricator.wikimedia.org/project/board/1241/
[Wikidata-bugs] [Maniphest] [Commented On] T174519: [epic] SDoC: Determine baseline for metrics
chelsyx added a comment. @Ramsey-WMF Is there any feedback from the team on the baseline metrics? Could we resolve this ticket and its child tickets? TASK DETAIL https://phabricator.wikimedia.org/T174519
[Wikidata-bugs] [Maniphest] [Changed Project Column] T179450: Documentation of SDoC findings
chelsyx moved this task from In progress to Needs review on the Discovery-Analysis (Current work) board. chelsyx added a comment. Done: https://meta.wikimedia.org/wiki/Research:Baseline_Metrics_for_Structured_Data_on_Wikimedia_Commons TASK DETAIL https://phabricator.wikimedia.org/T179450
[Wikidata-bugs] [Maniphest] [Changed Project Column] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting
chelsyx moved this task from In progress to Needs review on the Discovery-Analysis (Current work) board. chelsyx added a comment.
All results and the analysis codebase can be found here: https://github.com/wikimedia-research/SDoC-Initial-Metrics/tree/master/T177353
For unhelpful file names, I created a child ticket, T182849, since it should be a separate project and we don't have the bandwidth to deal with it now.
TASK DETAIL https://phabricator.wikimedia.org/T177353
[Wikidata-bugs] [Maniphest] [Commented On] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting
chelsyx added a comment.
Categorization
As of December 12, 2017, excluding hidden categories and 'needing_category' categories, 1,629,592 (3.73%) files don't belong to any category and 22,492,880 (51.55%) files belong to only 1 category.
F11832678: nfile_by_categories.png
The breakdown by media type and the analysis codebase can be found here: https://github.com/wikimedia-research/SDoC-Initial-Metrics/tree/master/T177353
If the numbers here conflict with T177353#3743257, that's because files with 'needing_category' categories may have other categories at the same time, possibly because users added categories to a file but forgot to remove 'needing_category', or because 'needing_category' was moved to the hidden categories. The graph above shows a more accurate count.
TASK DETAIL https://phabricator.wikimedia.org/T177353
[Wikidata-bugs] [Maniphest] [Triaged] T182849: Identify unhelpful file names on commons
chelsyx triaged this task as "Low" priority. TASK DETAIL https://phabricator.wikimedia.org/T182849
[Wikidata-bugs] [Maniphest] [Updated] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting
chelsyx added a subtask: T182849: Identify unhelpful file names on commons. TASK DETAIL https://phabricator.wikimedia.org/T177353
[Wikidata-bugs] [Maniphest] [Updated] T182849: Identify unhelpful file names on commons
chelsyx added a parent task: T177353: Metrics for SDoC: look at search hits based on which element the search is hitting. TASK DETAIL https://phabricator.wikimedia.org/T182849
[Wikidata-bugs] [Maniphest] [Updated] T182849: Identify unhelpful file names on commons
chelsyx added a comment.
Hello @thiemowmde ! The purpose of T177353 and its parent ticket T174519: [epic] SDoC: Determine baseline for metrics is to establish a baseline for metrics on Commons in order to measure future successes of the #structured-data-commons (SDoC) project. The SDoC team and we (#discovery-analysis) came up with a list of things that would be interesting to measure, and created T177353 and other child tickets (see T174519 for more details). This work is exploratory in nature: some metrics in the list are clearly defined, while others (for example, the exact meaning of "unhelpful") are not. Any ideas and comments are very welcome!
The Titleblacklist is used to block certain file names (generic, spam, etc.) through mw:Extension:Title blacklist when users try to upload files with these invalid names. However, regular expressions are not perfect, and some files with "unhelpful" names still get uploaded, e.g. File:Img-071129152243-0001.png and those in the move log whose change reason is meaningless or ambiguous, which currently require a human to identify. That's why I'm thinking about using a machine learning model to help identify these files.
TASK DETAIL https://phabricator.wikimedia.org/T182849
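To make the idea of learning "unhelpful" file names more concrete, here is a minimal sketch of the kind of hand-crafted features a classifier might start from. It is not the model from the ticket: the feature set, the function name, and the list of camera-style prefixes are all assumptions chosen for illustration.

```python
import re

# Hypothetical list of camera/scanner prefixes; not taken from the Titleblacklist.
CAMERA_PREFIX_RE = re.compile(r"^(img|dsc|dscn|dscf|image|scan)[-_ ]?\d", re.IGNORECASE)
# Runs of three or more letters (any script), used as a rough proxy for words.
WORD_RE = re.compile(r"[^\W\d_]{3,}")


def filename_features(title):
    """Return simple numeric features for a Commons file title (illustrative only)."""
    name = title[len("File:"):] if title.startswith("File:") else title
    stem = name.rsplit(".", 1)[0]  # drop the file extension if there is one
    n_digits = sum(ch.isdigit() for ch in stem)
    return {
        "length": len(stem),
        "digit_ratio": n_digits / len(stem) if stem else 0.0,
        "n_words": len(WORD_RE.findall(stem)),
        "has_camera_prefix": bool(CAMERA_PREFIX_RE.match(stem)),
    }
```

For File:Img-071129152243-0001.png these features show a camera-style prefix and a stem that is mostly digits, whereas a descriptive title yields several words and a low digit ratio. Labels for training could come from the move log, as the ticket suggests.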
[Wikidata-bugs] [Maniphest] [Claimed] T179450: Documentation of SDoC findings
chelsyx claimed this task. chelsyx moved this task from Backlog to In progress on the Discovery-Analysis (Current work) board. TASK DETAIL https://phabricator.wikimedia.org/T179450
[Wikidata-bugs] [Maniphest] [Created] T182849: Identify unhelpful file names on commons
chelsyx created this task. chelsyx added projects: Structured-Data-Commons, Discovery-Analysis. Herald added a subscriber: Aklapper. Herald added a project: Wikidata.
TASK DESCRIPTION
In T177353, we were asked to get a count of files with unhelpful names. To identify unhelpful file names, we can extract the old and new file names from the move log whose change reason is meaningless or ambiguous, and then train a classification model.
Putting this project in the backlog now. I will pick it up when we have some bandwidth.
TASK DETAIL https://phabricator.wikimedia.org/T182849
[Wikidata-bugs] [Maniphest] [Edited] T177358: Metrics for SDoC: translations
chelsyx updated the task description.
CHANGES TO TASK DESCRIPTION
...
* [x] how many files/descriptions are in multiple languages?
...
** [x] How many files are in lang X?
** [x] How many have multiple languages in them?
** [x] How many Western industrialized languages?
...
TASK DETAIL https://phabricator.wikimedia.org/T177358
[Wikidata-bugs] [Maniphest] [Commented On] T177358: Metrics for SDoC: translations
chelsyx added a comment.
We parsed the wikitext of all files in the Commons XML data dumps of November 20, 2017, and extracted the language templates in them (e.g. {{en}}, {{LangSwitch}}). Out of the total 43,268,565 files, 14,848,551 (34.32%) don't have any language templates and 23,780,247 (54.96%) use only 1 language.
F11792338: files_by_n_languages.png
40.1% of all files have English templates, 9.38% use German, and 6.2% have descriptions in languages that are not in the top 20.
F11792361: top20_languages_nfiles.png
For files without a language template, we used the langdetect package to detect their languages. We could not detect any language in 556,684 files (1.29% of all 43,268,565 files), and we detected 1 language in 7,577,789 (17.51%) files.
F11795099: files_by_n_detected_languages.png
We detected English in 30.25% of all 43,268,565 files and German in 3.93%.
F11795155: top20_detected_languages_nfiles.png
TASK DETAIL https://phabricator.wikimedia.org/T177358
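As an illustration of the template-extraction step described in this comment, the sketch below pulls language codes out of a wikitext string with regular expressions. It is not the analysis code from the ticket: the function name is hypothetical, KNOWN_LANGS is a tiny made-up subset of language codes, and the regexes ignore nested templates, which a full wikitext parser would handle more reliably.

```python
import re

# Hypothetical whitelist; a real run would need the full list of language templates.
KNOWN_LANGS = {"en", "de", "fr", "es", "it", "ru", "ja", "zh", "nl", "pl"}

# Matches the opening of templates such as {{en|...}} or {{de|1=...}}.
LANG_TEMPLATE_RE = re.compile(r"\{\{\s*([a-z]{2,3})\s*\|", re.IGNORECASE)
# Matches a non-nested {{LangSwitch|en=...|fr=...}} call and captures its parameters.
LANGSWITCH_RE = re.compile(r"\{\{\s*LangSwitch\b(.*?)\}\}", re.IGNORECASE | re.DOTALL)
LANGSWITCH_PARAM_RE = re.compile(r"\|\s*([a-z]{2,3})\s*=", re.IGNORECASE)


def languages_in_wikitext(wikitext):
    """Return the set of known language codes used by language templates."""
    codes = {code.lower() for code in LANG_TEMPLATE_RE.findall(wikitext)}
    for params in LANGSWITCH_RE.findall(wikitext):
        codes.update(code.lower() for code in LANGSWITCH_PARAM_RE.findall(params))
    # The whitelist filters out short non-language template names.
    return codes & KNOWN_LANGS
```

Counting files by the size of the returned set gives the "no language template", "1 language", and "multiple languages" buckets reported above; files with an empty set would be the ones passed to a language detector.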
[Wikidata-bugs] [Maniphest] [Blocker] T174519: [epic] SDoC: Determine baseline for metrics
chelsyx changed the status of subtask T177353: Metrics for SDoC: look at search hits based on which element the search is hitting from "Stalled" to "Open". TASK DETAIL https://phabricator.wikimedia.org/T174519
[Wikidata-bugs] [Maniphest] [Changed Status] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting
chelsyx changed the task status from "Stalled" to "Open". chelsyx raised the priority of this task from "Low" to "Normal". chelsyx added a comment.
We parsed the wikitext of all files in the Commons XML data dumps of November 20, 2017. Out of the total 43,268,565 files, 41,796,560 (96.6%) have an infobox and 41,309,028 (95.47%) have some content in their description fields (description, title, depicted people, depicted place, etc.).
Caveat: There are a large number of infobox-like templates (e.g. Infobox_templates:_based_on_Information_template, Data_ingestion_layout_templates, templates only for one batch of uploads like this) with description fields of various names (e.g. some use commons_description instead of description). This makes counting very difficult because we cannot enumerate all of these infobox names and description field names.
Some users create their own templates on top of other infobox templates for upload convenience. This masks the file description, so it cannot be searched. For example, the wikitext of File:Cyclopaedia, Chambers - Volume 1 - 0133.jpg is:
    {{Cyclopaedia, Chambers page | volume = 1 | prev = 0132 | page = 0133 | next = 0134 }}
A lot of the information we see on the web page is actually hidden in its template, Template:Cyclopaedia,_Chambers_page. This makes it very hard to find this file through search, because search runs over the file's wikitext shown above. We should encourage our users to clean up this kind of template.
TASK DETAIL https://phabricator.wikimedia.org/T177353
[Wikidata-bugs] [Maniphest] [Updated] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting
chelsyx edited projects, added Discovery-Analysis (Current work); removed Discovery-Analysis. TASK DETAIL https://phabricator.wikimedia.org/T177353
[Wikidata-bugs] [Maniphest] [Claimed] T177358: Metrics for SDoC: translations
chelsyx claimed this task. chelsyx moved this task from Backlog to In progress on the Discovery-Analysis (Current work) board. TASK DETAIL https://phabricator.wikimedia.org/T177358
[Wikidata-bugs] [Maniphest] [Changed Project Column] T177534: Search Metrics for SDoC: eventlogging
chelsyx moved this task from In progress to Needs review on the Discovery-Analysis (Current work) board. chelsyx added a comment.
We computed several search metrics from event logging data for November 2017 and compared them with English Wikipedia. They cover searches on desktop only, since there are very few searches on mobile web on Commons (fewer than 100 search result pages daily).
The zero results rate for full-text search is slightly lower on Commons than on English Wikipedia:
F11091871: zrr_all.png
However, the clickthrough rate for full-text search is much lower than on English Wikipedia, only 10.42%:
F11091881: ctr_all.png
Also, users on Commons are much more likely to click through to other pages of search results:
F11091886: serp_offset_all.png
See https://github.com/wikimedia-research/SDoC-Initial-Metrics/tree/master/T177534 for more results.
TASK DETAIL https://phabricator.wikimedia.org/T177534
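For readers unfamiliar with the two headline metrics, the sketch below shows one common way to compute them from per-search records. The definitions are assumptions, not necessarily the ones used in the linked analysis: zero results rate is taken as the share of searches returning no results, and clickthrough rate as the share of searches with results that received at least one click. The record fields n_results and clicked are hypothetical.

```python
def zero_results_rate(searches):
    """Share of searches that returned no results (assumed definition)."""
    if not searches:
        return float("nan")
    return sum(1 for s in searches if s["n_results"] == 0) / len(searches)


def clickthrough_rate(searches):
    """Share of searches with results that got at least one click (assumed definition)."""
    with_results = [s for s in searches if s["n_results"] > 0]
    if not with_results:
        return float("nan")
    return sum(1 for s in with_results if s["clicked"]) / len(with_results)


# Toy data for illustration only; these are not values from the event logs.
toy_searches = [
    {"n_results": 0, "clicked": False},
    {"n_results": 10, "clicked": True},
    {"n_results": 5, "clicked": False},
    {"n_results": 3, "clicked": False},
]
```

On the toy data, one of four searches has no results and one of the three remaining searches has a click. Whether searches without results belong in the clickthrough denominator is a design choice that changes the number, so it should be stated alongside any reported rate.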
[Wikidata-bugs] [Maniphest] [Claimed] T177534: Metrics for SDoC: eventlogging
chelsyx claimed this task. chelsyx edited projects, added Discovery-Analysis (Current work); removed Discovery-Analysis. TASK DETAIL https://phabricator.wikimedia.org/T177534
[Wikidata-bugs] [Maniphest] [Commented On] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting
chelsyx added a comment.
Status of the subtasks on this ticket:
* Search hits based on which element the search is hitting (file name vs. description vs. category): This is not currently feasible. A possible solution is described in T177353#3716344, and we will need help from the search backend team.
* "Unfindable" images metrics (lack of categorization, unhelpful file name, no description or poor description):
** Categories: The number of files having a "needing categories" category, with a breakdown, is shown in T177353#3743257. We have a query to count files by number of categories, category type (hidden vs. not) and media type, but we are running into problems when running it against the MySQL database. A possible solution is available, but it would take some time.
** Description: We could use advanced search and/or parse the page content with Hive (using an experimental table set up by Analytics), but it would take some time.
** File name: We could do this with machine learning as described in T177353#3712897, but it would take some time to train and tune the model.
* Investigate file annotations and whether any tracking (logging) of them is available: Done. See T177353#3711572.
Given the difficulties described above, @debt and I decided to move this ticket to the backlog and work on other SDoC metrics first.
TASK DETAIL https://phabricator.wikimedia.org/T177353
[Wikidata-bugs] [Maniphest] [Commented On] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting
chelsyx added a comment.
On November 7, the number of files having a "needing categories" category was 4,268,386 (10%). The following table breaks down the counts by media type:
img_media_type  need_cat  n_files   proportion
bitmap          no        36176941  84.47%
bitmap          yes       4207232   9.82%
drawing         no        1167389   2.73%
drawing         yes       17744     0.04%
audio           no        792223    1.85%
audio           yes       2625      0.01%
video           no        71944     0.17%
video           yes       36613     0.09%
multimedia      no        4         0%
office          no        351035    0.82%
office          yes       4172      0.01%
TASK DETAIL https://phabricator.wikimedia.org/T177353
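The proportions in this comment can be re-derived from the raw counts. The sketch below uses only the numbers from the table above; the variable names are made up for illustration.

```python
# (media type, needs categories?) -> number of files, copied from the table above.
counts = {
    ("bitmap", "no"): 36176941,
    ("bitmap", "yes"): 4207232,
    ("drawing", "no"): 1167389,
    ("drawing", "yes"): 17744,
    ("audio", "no"): 792223,
    ("audio", "yes"): 2625,
    ("video", "no"): 71944,
    ("video", "yes"): 36613,
    ("multimedia", "no"): 4,
    ("office", "no"): 351035,
    ("office", "yes"): 4172,
}

total_files = sum(counts.values())
need_cat_files = sum(n for (_, flag), n in counts.items() if flag == "yes")

# Overall share of files flagged as needing categories.
need_cat_share = need_cat_files / total_files

# Per-row percentage of all files, rounded to two decimals as in the table.
proportions = {key: round(100 * n / total_files, 2) for key, n in counts.items()}
```

The "yes" rows sum to the 4,268,386 quoted in the comment, which is about 10% of all files in the table.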
[Wikidata-bugs] [Maniphest] [Commented On] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting
chelsyx added a comment.
In T177353#3714007, @debt wrote:
> Oh, that looks like that will be quite interesting, @chelsyx, although it looks like it might be a bit of manual work involved.
Getting data from the move log is easy, but it will take some time to train and adjust the model. @debt @Ramsey-WMF Let me know if you want me to spend time on getting other metrics done rather than this.
TASK DETAIL https://phabricator.wikimedia.org/T177353
[Wikidata-bugs] [Maniphest] [Commented On] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting
chelsyx added a comment.
In T177353#3716995, @debt wrote:
> Great idea, @EBernhardson, let's do it! @chelsyx can you get that sampling from the data we already have?
@debt Yes, I can get those queries from the TestSearchSatisfaction2 table. We will need help from @EBernhardson to run them against the test cluster and check the results.
TASK DETAIL https://phabricator.wikimedia.org/T177353
[Wikidata-bugs] [Maniphest] [Commented On] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting
chelsyx added a comment. For unhelpful file names, I want to extract the old and new file names from move log entries where the reason for the rename is that the old name was meaningless or ambiguous, and then train a model to classify file names. As far as I know, short text classification like this is a bit tricky. @mpopov do you have any suggestions? TASK DETAIL https://phabricator.wikimedia.org/T177353
[Wikidata-bugs] [Maniphest] [Edited] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting
chelsyx updated the task description. (Show Details) CHANGES TO TASK DESCRIPTION ... * [x] investigate file annotations and if any tracking (logging) of them are available ... TASK DETAIL https://phabricator.wikimedia.org/T177353
[Wikidata-bugs] [Maniphest] [Edited] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting
chelsyx updated the task description. (Show Details) CHANGES TO TASK DESCRIPTION ... ** After talking with @EBernhardson, we decided this is not feasible since we don't record this information now TASK DETAIL https://phabricator.wikimedia.org/T177353
[Wikidata-bugs] [Maniphest] [Commented On] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting
chelsyx added a comment. There are 142,994 files with annotations (ImageNote); follow this link for the most current count. The revision history of annotations is stored along with the rest of the page revision history, for example: https://commons.wikimedia.org/w/index.php?title=File:Henley_2009_women.jpg @Ramsey-WMF Is this what you want? TASK DETAIL https://phabricator.wikimedia.org/T177353
[Wikidata-bugs] [Maniphest] [Edited] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting
chelsyx added a subscriber: EBernhardson. chelsyx updated the task description. (Show Details) CHANGES TO TASK DESCRIPTION ...
* [x] file name vs. description vs. category
* [] "Unfindable" images metrics
* [] After talking with @EBernhardson, we decided this is not feasible since we don't record this information now.
* [] "Unfindable" images metrics: lack of categorization, unhelpful file name, no description (or poor description)
...
TASK DETAIL https://phabricator.wikimedia.org/T177353
[Wikidata-bugs] [Maniphest] [Commented On] T177354: Metrics for SDoC: look at contributions
chelsyx added a comment. Good idea! Thanks @Nuria! TASK DETAIL https://phabricator.wikimedia.org/T177354
[Wikidata-bugs] [Maniphest] [Commented On] T177354: Metrics for SDoC: look at contributions
chelsyx added a comment.
Hi @Nuria, the numbers I showed above are cumulative sums at the end of each month, while the numbers you mentioned are new uploads for each month. From my query, for Dec 2016, the number of newly uploaded files is 392,566 by bots and 392,786 by users. This is close to what is shown on https://stats.wikimedia.org/wikispecial/EN/TablesWikipediaCOMMONS.htm. I think the differences come from two sources:
1. I assume the numbers on https://stats.wikimedia.org/wikispecial/EN/TablesWikipediaCOMMONS.htm are computed at the end of each month, and files could be deleted afterwards. For the numbers above, I used the image table, which only counts files that still existed on Oct 12, 2017.
2. According to commons bots, not all accounts operated as bots have a bot flag, so I also include accounts whose user pages are in categories matching the keywords "bot_flag" or "bots" (see the query below).
Query for counting newly uploaded files on Commons:
    SELECT
      LEFT(img_timestamp, 6) AS yr_month,
      user_group,
      COUNT(*) AS n_files
    FROM (
      -- Get active/inactive bots
      SELECT ug_user AS user_id, ug_group AS user_group
      FROM user_groups
      WHERE ug_group = 'bot'
      UNION
      SELECT ufg_user AS user_id, ufg_group AS user_group
      FROM user_former_groups
      WHERE ufg_group = 'bot'
      UNION
      -- Get user ids with bot categories in their user pages
      SELECT user.user_id, 'bot' AS user_group
      FROM user
      INNER JOIN (
        -- all user page names with bot category
        SELECT REPLACE(page.page_title, '_', ' ') AS user_name
        FROM page
        INNER JOIN (
          -- page ids with bot categories
          SELECT DISTINCT cl_from AS page_id
          FROM categorylinks
          WHERE cl_to REGEXP '_(bot_flag|bots)(_|$)' AND cl_type = 'page'
        ) AS bot_cat ON page.page_id = bot_cat.page_id
        WHERE page_namespace = 2
      ) AS bot_name ON user.user_name = bot_name.user_name
    ) AS bots
    RIGHT JOIN image ON bots.user_id = image.img_user
    GROUP BY LEFT(img_timestamp, 6), user_group;
TASK DETAIL https://phabricator.wikimedia.org/T177354
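Two small pieces of the query above are easy to misread: the category pattern `_(bot_flag|bots)(_|$)` and the `LEFT(img_timestamp, 6)` bucketing. The sketch below reproduces both in Python so their behaviour can be checked on sample inputs. The category names are hypothetical examples, not real Commons categories, and Python's `re` is always case-sensitive here, whereas MySQL `REGEXP` case sensitivity depends on the column's type and collation.

```python
import re

# Same pattern as in the query: an underscore, then "bot_flag" or "bots",
# followed by another underscore or the end of the category name.
BOT_CATEGORY = re.compile(r"_(bot_flag|bots)(_|$)")


def is_bot_category(cl_to: str) -> bool:
    """True if a category name (underscored, as in categorylinks.cl_to) matches."""
    return BOT_CATEGORY.search(cl_to) is not None


def year_month(img_timestamp: str) -> str:
    """Equivalent of LEFT(img_timestamp, 6) for a YYYYMMDDHHMMSS timestamp."""
    return img_timestamp[:6]


# Hypothetical category names, for illustration only.
examples = {
    "Commons_bots": is_bot_category("Commons_bots"),
    "Accounts_with_bot_flag": is_bot_category("Accounts_with_bot_flag"),
    "Commons_bots_archive": is_bot_category("Commons_bots_archive"),
    "Bots": is_bot_category("Bots"),
    "Commons_botswana": is_bot_category("Commons_botswana"),
}
```

Because the pattern requires a leading underscore, a category named exactly "Bots" would not match; whether that matters depends on how the bot categories are actually named.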
[Wikidata-bugs] [Maniphest] [Claimed] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting
chelsyx claimed this task. chelsyx edited projects, added Discovery-Analysis (Current work); removed Discovery-Analysis. TASK DETAIL https://phabricator.wikimedia.org/T177353
[Wikidata-bugs] [Maniphest] [Commented On] T177354: Metrics for SDoC: look at contributions
chelsyx added a comment. Codebase and output: https://github.com/wikimedia-research/SDoC-Initial-Metrics/tree/master/T177354 TASK DETAIL https://phabricator.wikimedia.org/T177354
[Wikidata-bugs] [Maniphest] [Commented On] T177354: Metrics for SDoC: look at contributions
chelsyx added a comment. @mpopov yup, I will put my stuff in the repo. TASK DETAIL https://phabricator.wikimedia.org/T177354
[Wikidata-bugs] [Maniphest] [Edited] T177354: Metrics for SDoC: look at contributions
chelsyx updated the task description. (Show Details) CHANGES TO TASK DESCRIPTION ...
* [x] individuals
* [x] mass-tools/institutions
* [x] number of contributions as of present time
* [x] compare to what it looked like 30 days ago
TASK DETAIL https://phabricator.wikimedia.org/T177354
[Wikidata-bugs] [Maniphest] [Commented On] T177354: Metrics for SDoC: look at contributions
chelsyx added a comment. The following two graphs break down the numbers by month: F10169825: nfile_bot_month.png F10169827: nfile_bot_month_prop.png TASK DETAIL https://phabricator.wikimedia.org/T177354
[Wikidata-bugs] [Maniphest] [Commented On] T177354: Metrics for SDoC: look at contributions
chelsyx added a comment.
Updated: On Oct 12, 2017, the number of files uploaded by bots was 9,390,721 (22.03%), and the number of files uploaded by users was 33,241,541 (77.97%). The following table breaks down the counts by media type:
Media Type  User Group  Number of Files  Proportion
bitmap      user        31355343         73.55%
bitmap      bot         8843447          20.74%
drawing     user        905964           2.13%
drawing     bot         270516           0.63%
audio       user        698566           1.64%
audio       bot         95646            0.22%
video       user        71738            0.17%
video       bot         36329            0.09%
multimedia  user        4                0%
office      user        209926           0.49%
office      bot         144783           0.34%
TASK DETAIL https://phabricator.wikimedia.org/T177354
[Wikidata-bugs] [Maniphest] [Commented On] T177354: Metrics for SDoC: look at contributions
chelsyx added a comment. @mpopov Looks like the file type categorization on Commons is messier than we thought. For example, File:Krazy_Kat_Bugolist_1916_silent.ogv is an ogv file, but its img_minor_mime is ogg, its img_major_mime is application, and its img_media_type is video. The same is true for other ogv files. For ogg files like File:Whitenoisesound.ogg, img_minor_mime is ogg, img_major_mime is application, and img_media_type is audio. I am not sure whether the img_media_type field is more trustworthy. TASK DETAIL https://phabricator.wikimedia.org/T177354
[Wikidata-bugs] [Maniphest] [Commented On] T177354: Metrics for SDoC: look at contributions
chelsyx added a comment.
> Hey @chelsyx - what time frame does this cover?
> Jumping in to say this looks like it's from launch of Commons to now.
Thanks @mpopov! Yes, these are the file counts as of Oct 10.
> Can we also get a count of how this has changed over the last week and compare that to the last 30 days? It'd be interesting to see if the numbers are fairly consistent (individual vs institution) or if they have changed quite a bit when extending the time scope.
> @chelsyx this may be useful: https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Edits as it contains monthly snapshots of the page & user tables as of April 2017
Unfortunately, the MediaWiki snapshot doesn't have the image table, which describes images and other uploaded files.
TASK DETAIL https://phabricator.wikimedia.org/T177354
[Wikidata-bugs] [Maniphest] [Changed Project Column] T177354: Metrics for SDoC: look at contributions
chelsyx moved this task from In progress to Needs review on the Discovery-Analysis (Current work) board. chelsyx added a comment.
The number of files uploaded by bots is 9,390,408 (22.04%), and the number of files uploaded by users is 33,222,828 (77.96%). The following table breaks down the counts by major MIME type:
img_major_mime  user_group  n_files
application     user        927448
application     bot         273617
audio           user        12479
audio           bot         2206
image           user        32242778
image           bot         9113650
video           user        40133
video           bot         935
Query:
    SELECT
      img_major_mime,
      user_group,
      COUNT(*) AS n_files
    FROM (
      -- Get active/inactive bots
      SELECT ug_user AS user_id, ug_group AS user_group
      FROM user_groups
      WHERE ug_group = 'bot'
      UNION
      SELECT ufg_user AS user_id, ufg_group AS user_group
      FROM user_former_groups
      WHERE ufg_group = 'bot'
      UNION
      -- Get user ids with bot categories in their user pages
      SELECT user.user_id, 'bot' AS user_group
      FROM user
      INNER JOIN (
        -- all user page names with bot category
        SELECT REPLACE(page.page_title, '_', ' ') AS user_name
        FROM page
        INNER JOIN (
          -- page ids with bot categories
          SELECT DISTINCT cl_from AS page_id
          FROM categorylinks
          WHERE cl_to REGEXP '_(bot_flag|bots)(_|$)' AND cl_type = 'page'
        ) AS bot_cat ON page.page_id = bot_cat.page_id
        WHERE page_namespace = 2
      ) AS bot_name ON user.user_name = bot_name.user_name
    ) AS bots
    RIGHT JOIN image ON bots.user_id = image.img_user
    GROUP BY img_major_mime, user_group;
TASK DETAIL https://phabricator.wikimedia.org/T177354
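One detail of the query above: because every `image` row is kept by the outer join, uploaders that are not in the `bots` subquery end up with a NULL `user_group`, while the tables in this thread label them "user". The relabeling step is not shown in the thread, so the sketch below is only one assumed way to do it, using `COALESCE` on a toy in-memory SQLite database. The tables are cut-down, made-up stand-ins for the real MediaWiki tables, and a `LEFT JOIN` from `image` is used in place of the equivalent `RIGHT JOIN` because older SQLite versions lack `RIGHT JOIN`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Toy stand-ins for the real tables; rows are invented for illustration.
    CREATE TABLE image (img_name TEXT, img_user INTEGER, img_major_mime TEXT);
    CREATE TABLE bots (user_id INTEGER, user_group TEXT);
    INSERT INTO image VALUES
        ('a.jpg', 1, 'image'),
        ('b.jpg', 2, 'image'),
        ('c.ogg', 2, 'application'),
        ('d.png', 3, 'image');
    INSERT INTO bots VALUES (2, 'bot');
    """
)

# Keep every image row; uploaders without a match in bots get NULL,
# which COALESCE turns into the label 'user'.
rows = conn.execute(
    """
    SELECT img_major_mime,
           COALESCE(user_group, 'user') AS user_group,
           COUNT(*) AS n_files
    FROM image
    LEFT JOIN bots ON bots.user_id = image.img_user
    GROUP BY img_major_mime, COALESCE(user_group, 'user')
    ORDER BY img_major_mime, user_group
    """
).fetchall()
conn.close()
```

With this toy data, user 2 is the only bot, so its two files are counted as "bot" and the other two image files as "user".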
[Wikidata-bugs] [Maniphest] [Changed Project Column] T177354: Metrics for SDoC: look at contributions
chelsyx moved this task from Needs triage to Current work on the Discovery-Analysis board. chelsyx edited projects, added Discovery-Analysis (Current work); removed Discovery-Analysis. TASK DETAIL https://phabricator.wikimedia.org/T177354
[Wikidata-bugs] [Maniphest] [Claimed] T177354: Metrics for SDoC: look at contributions
chelsyx claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T177354
[Wikidata-bugs] [Maniphest] [Commented On] T143762: WDQS: Geographic breakdown of SPARQL queries
chelsyx added a comment. Thank you @debt! :) TASK DETAIL https://phabricator.wikimedia.org/T143762
[Wikidata-bugs] [Maniphest] [Commented On] T143762: WDQS: Geographic breakdown of SPARQL queries
chelsyx added a comment. Thanks @debt! Updated on Commons! TASK DETAIL https://phabricator.wikimedia.org/T143762
[Wikidata-bugs] [Maniphest] [Commented On] T143762: WDQS: Geographic breakdown of SPARQL queries
chelsyx added a comment. Modified: F4553759: report.pdf @debt Please let me know if anything else needs to be changed. TASK DETAIL https://phabricator.wikimedia.org/T143762
[Wikidata-bugs] [Maniphest] [Commented On] T143762: WDQS: Geographic breakdown of SPARQL queries
chelsyx added a comment. Thanks everyone! I've uploaded the report to Commons: https://commons.wikimedia.org/wiki/File:Exploration_on_the_Use_of_WDQS_-_Breakdown_by_Geography,_User_Agent_and_Referer_Class.pdf TASK DETAIL https://phabricator.wikimedia.org/T143762
[Wikidata-bugs] [Maniphest] [Commented On] T143762: WDQS: Geographic breakdown of SPARQL queries
chelsyx added a comment.
@Smalyshev what do you mean by "error responses"? Here is an example of my query:
    SELECT
      CONCAT(year, '-', month, '-', day) AS dt,
      PERCENTILE_APPROX(time_firstbyte, 0.5) AS median_time_firstbyte,
      PERCENTILE(response_size, 0.5) AS median_response_size
    FROM webrequest
    WHERE year = 2016 AND month = 07 AND day = 01
      AND webrequest_source = 'misc'
      AND uri_host = 'query.wikidata.org'
      AND uri_path = '/bigdata/namespace/wdq/sparql'
      AND http_status IN ('200', '304')
      AND INSTR(uri_query, '?query=') > 0
    GROUP BY CONCAT(year, '-', month, '-', day);
TASK DETAIL https://phabricator.wikimedia.org/T143762
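To make the filtering and aggregation in the query above concrete, here is a minimal Python sketch of the same logic on a handful of made-up request rows. The dictionary keys mirror the column names used in the query, but the rows and the helper names (`keep`, `daily_medians`) are invented for this example. The sketch computes exact medians, whereas `PERCENTILE_APPROX` in Hive returns an approximation.

```python
from collections import defaultdict
from statistics import median


def keep(row):
    """Mirror the WHERE clause: status 200/304 and '?query=' in the query string."""
    return row["http_status"] in ("200", "304") and "?query=" in row["uri_query"]


def daily_medians(rows):
    """Group kept rows by day and return (median time_firstbyte, median response_size)."""
    by_day = defaultdict(list)
    for row in rows:
        if keep(row):
            by_day[row["dt"]].append(row)
    return {
        dt: (
            median(r["time_firstbyte"] for r in day_rows),
            median(r["response_size"] for r in day_rows),
        )
        for dt, day_rows in by_day.items()
    }


# Made-up rows for one day; values are illustrative only.
rows = [
    {"dt": "2016-07-01", "http_status": "200", "uri_query": "?query=SELECT", "time_firstbyte": 0.10, "response_size": 500},
    {"dt": "2016-07-01", "http_status": "304", "uri_query": "?query=ASK", "time_firstbyte": 0.30, "response_size": 0},
    {"dt": "2016-07-01", "http_status": "200", "uri_query": "?query=DESCRIBE", "time_firstbyte": 0.20, "response_size": 900},
    {"dt": "2016-07-01", "http_status": "500", "uri_query": "?query=SELECT", "time_firstbyte": 9.00, "response_size": 100},
    {"dt": "2016-07-01", "http_status": "200", "uri_query": "?format=json&query=SELECT", "time_firstbyte": 0.50, "response_size": 700},
]
result = daily_medians(rows)
```

Note that the `'?query='` test only matches requests where `query` is the first URL parameter; the last sample row, where it comes second, is filtered out just like the error response.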
[Wikidata-bugs] [Maniphest] [Commented On] T143762: WDQS: Geographic breakdown of SPARQL queries
chelsyx added a comment. Updated Reviewers: F4537643: report.pdf @debt and @Smalyshev, your suggestions are very welcome!!! :) TASK DETAIL https://phabricator.wikimedia.org/T143762
[Wikidata-bugs] [Maniphest] [Commented On] T143762: WDQS: Geographic breakdown of SPARQL queries
chelsyx added a comment. 3rd draft: F4487819: report.pdf TASK DETAIL https://phabricator.wikimedia.org/T143762
[Wikidata-bugs] [Maniphest] [Commented On] T143762: WDQS: Geographic breakdown of SPARQL queries
chelsyx added a comment. Second Draft: F4452046: report.pdf TASK DETAIL https://phabricator.wikimedia.org/T143762
[Wikidata-bugs] [Maniphest] [Commented On] T143762: WDQS: Geographic breakdown of SPARQL queries
chelsyx added a comment. First draft of the report: F4420629: report.pdf I put a lot of material into the report. However, because of my lack of domain knowledge, I don't have a very clear idea of which questions are meaningful or useful to answer, so any suggestions are very welcome!!! TASK DETAIL https://phabricator.wikimedia.org/T143762