[Wikidata-bugs] [Maniphest] [Unblock] T193691: As a user of the Wikipedia app, I would like to be able to add or edit title descriptions from the app (eg. Wikidata descriptions)

2019-04-16 Thread chelsyx
chelsyx closed subtask T203723: As a product analyst I would like to know how 
people are using the Wikidata Descriptions editing features as 
Resolved.

TASK DETAIL
  https://phabricator.wikimedia.org/T193691

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Mhurd, chelsyx
Cc: ABorbaWMF, Sjoerddebruin, JMinor, PDrouin-WMF, Aklapper, Mhurd, cmadeo, 
Ddurigon, alaa_wmde, Mateo1977, Nandana, Lahi, Gq86, GoranSMilovanovic, 
QZanden, Taquo, LawExplorer, catalandres, _jensen, rosalieper, Karthik_sripal, 
Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T182849: Identify unhelpful file names on commons

2019-02-18 Thread chelsyx
chelsyx added a comment.
A first try using logistic regression: https://paws-public.wmflabs.org/paws-public/User:CXie_(WMF)/commons_file_names.ipynb

TASK DETAIL
  https://phabricator.wikimedia.org/T182849

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: diego, Base, Liuxinyu970226, thiemowmde, Aklapper, Abit, Ramsey-WMF, mpopov, chelsyx, Nandana, JKSTNK, Lahi, PDrouin-WMF, Gq86, E1presidente, Cparle, Anooprao, SandraF_WMF, GoranSMilovanovic, QZanden, Tramullas, Acer, LawExplorer, Silverfish, _jensen, Susannaanas, Jane023, Wikidata-bugs, matthiasmullie, aude, Ricordisamoa, Wesalius, Lydia_Pintscher, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___


[Wikidata-bugs] [Maniphest] [Updated] T204415: Query stats dashboard not updating

2018-09-20 Thread chelsyx
chelsyx added a project: Analytics.
TASK DETAIL
  https://phabricator.wikimedia.org/T204415

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: Nuria, mpopov, chelsyx, Aklapper, Addshore, Smalyshev, Lydia_Pintscher, Akovalyov, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, Jonas, Xmlizer, JAllemandou, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331, jeremyb
___


[Wikidata-bugs] [Maniphest] [Changed Subscribers] T204415: Query stats dashboard not updating

2018-09-18 Thread chelsyx
chelsyx added a subscriber: Nuria.
chelsyx added a comment.
Hi @Nuria, we noticed that since August 10th the SPARQL usage count has been very small (see the query in T204415#4590108), far lower than what we see in logstash: https://logstash.wikimedia.org/goto/74e376f55fcdc3b93e4a7232cfa5203a
Do you know of any webrequest incident that might have caused this?

TASK DETAIL
  https://phabricator.wikimedia.org/T204415

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: Nuria, mpopov, chelsyx, Aklapper, Addshore, Smalyshev, Lydia_Pintscher, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
___


[Wikidata-bugs] [Maniphest] [Commented On] T204415: Query stats dashboard not updating

2018-09-17 Thread chelsyx
chelsyx added a comment.
Hi @Smalyshev, the dashboard is updating. However, since August 10th the SPARQL usage count has been very small (even 0 on some days) and the LDF usage count has been 0. Did we change the URI of the endpoint?

Query (SQL):

SELECT
  year, month, day,
  -- Count '/sparql' together with '/bigdata/namespace/wdq/sparql'
  IF(uri_path = '/sparql', '/bigdata/namespace/wdq/sparql', uri_path) AS path,
  UPPER(http_status IN('200','304')) AS http_success,
  -- Flag requests that look automated: self-identifying user agents,
  -- user agents that could not be parsed, or anything tagged as a spider
  CASE
    WHEN (
      agent_type = 'user' AND (
        user_agent RLIKE 'https?://'
        OR INSTR(user_agent, 'www.') > 0
        OR INSTR(user_agent, 'github') > 0
        OR LOWER(user_agent) RLIKE '([a-z0-9._%-]+@[a-z0-9.-]+\.(com|us|net|org|edu|gov|io|ly|co|uk))'
        OR (
          user_agent_map['browser_family'] = 'Other'
          AND user_agent_map['device_family'] = 'Other'
          AND user_agent_map['os_family'] = 'Other'
        )
      )
    ) OR agent_type = 'spider' THEN 'TRUE'
    ELSE 'FALSE' END AS is_automata,
  COUNT(*) AS events
FROM webrequest
WHERE
  webrequest_source = 'misc'
  AND year = 2018 AND month = 8 AND day > 9
  AND uri_host = 'query.wikidata.org'
  AND uri_path IN('/', '/bigdata/namespace/wdq/sparql', '/bigdata/ldf', '/sparql')
-- The GROUP BY repeats the SELECT expressions verbatim
GROUP BY
  year, month, day,
  IF(uri_path = '/sparql', '/bigdata/namespace/wdq/sparql', uri_path),
  UPPER(http_status IN('200','304')),
  CASE
    WHEN (
      agent_type = 'user' AND (
        user_agent RLIKE 'https?://'
        OR INSTR(user_agent, 'www.') > 0
        OR INSTR(user_agent, 'github') > 0
        OR LOWER(user_agent) RLIKE '([a-z0-9._%-]+@[a-z0-9.-]+\.(com|us|net|org|edu|gov|io|ly|co|uk))'
        OR (
          user_agent_map['browser_family'] = 'Other'
          AND user_agent_map['device_family'] = 'Other'
          AND user_agent_map['os_family'] = 'Other'
        )
      )
    ) OR agent_type = 'spider' THEN 'TRUE'
    ELSE 'FALSE' END
ORDER BY year, month, day
LIMIT 1

TASK DETAIL
  https://phabricator.wikimedia.org/T204415

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: mpopov, chelsyx, Aklapper, Addshore, Smalyshev, Lydia_Pintscher, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
___
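
For readers who want to test the is_automata rule from the query above outside of SQL, here is a minimal Python sketch of the same CASE expression. The function name and argument layout are assumptions made for illustration; only the patterns themselves are taken from the query.

```python
import re

# Patterns copied from the query's CASE expression.
URL_RE = re.compile(r"https?://")
EMAIL_RE = re.compile(
    r"([a-z0-9._%-]+@[a-z0-9.-]+\.(com|us|net|org|edu|gov|io|ly|co|uk))"
)


def is_automata(agent_type, user_agent, user_agent_map):
    """Return True when a request looks automated, following the query's rules.

    Hypothetical helper: the arguments mirror the webrequest columns
    agent_type, user_agent and user_agent_map used in the query.
    """
    if agent_type == "spider":
        return True
    if agent_type != "user":
        return False
    # All three parsed user-agent families are 'Other'.
    unknown_client = all(
        user_agent_map.get(key) == "Other"
        for key in ("browser_family", "device_family", "os_family")
    )
    return bool(
        URL_RE.search(user_agent)              # RLIKE 'https?://'
        or "www." in user_agent                # INSTR(user_agent, 'www.') > 0
        or "github" in user_agent              # INSTR(user_agent, 'github') > 0
        or EMAIL_RE.search(user_agent.lower()) # e-mail address in the UA
        or unknown_client
    )
```

Like the query, the sketch treats a user agent that names a URL, a contact e-mail address, or GitHub as a self-identified tool, and it falls back to the parsed families only when none of those markers is present.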


[Wikidata-bugs] [Maniphest] [Claimed] T182849: Identify unhelpful file names on commons

2018-05-10 Thread chelsyx
chelsyx claimed this task.
TASK DETAIL
  https://phabricator.wikimedia.org/T182849

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: Base, Liuxinyu970226, thiemowmde, Aklapper, Abit, Ramsey-WMF, mpopov, chelsyx, Lahi, PDrouin-WMF, Gq86, E1presidente, Cparle, SandraF_WMF, GoranSMilovanovic, QZanden, Tramullas, Acer, LawExplorer, Susannaanas, Aschroet, Jane023, Wikidata-bugs, PKM, matthiasmullie, aude, Ricordisamoa, Lydia_Pintscher, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___


[Wikidata-bugs] [Maniphest] [Closed] T177534: Search Metrics for SDoC: eventlogging

2018-02-07 Thread chelsyx
chelsyx closed this task as "Resolved".
TASK DETAIL
  https://phabricator.wikimedia.org/T177534

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: mpopov, chelsyx, debt, Aklapper, Lahi, PDrouin-WMF, Gq86, E1presidente, Ramsey-WMF, Cparle, Darkminds3113, SandraF_WMF, GoranSMilovanovic, QZanden, EBjune, Tramullas, Acer, LawExplorer, Avner, Gehel, FloNight, Susannaanas, Aschroet, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Lydia_Pintscher, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___


[Wikidata-bugs] [Maniphest] [Unblock] T174519: [epic] SDoC: Determine baseline for metrics

2018-02-07 Thread chelsyx
chelsyx closed subtask T177534: Search Metrics for SDoC: eventlogging as "Resolved".
TASK DETAIL
  https://phabricator.wikimedia.org/T174519

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: Nuria, Liuxinyu970226, Capt_Swing, Ramsey-WMF, SandraF_WMF, Abit, chelsyx, mpopov, debt, Aklapper, Lahi, PDrouin-WMF, Gq86, E1presidente, Cparle, Darkminds3113, GoranSMilovanovic, QZanden, EBjune, Tramullas, Acer, LawExplorer, Avner, Gehel, FloNight, Susannaanas, Aschroet, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Lydia_Pintscher, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___


[Wikidata-bugs] [Maniphest] [Closed] T179450: Documentation of SDoC findings

2018-02-07 Thread chelsyx
chelsyx closed this task as "Resolved".
TASK DETAIL
  https://phabricator.wikimedia.org/T179450

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: Keegan, Aklapper, mpopov, chelsyx, Abit, SandraF_WMF, Capt_Swing, Liuxinyu970226, debt, Nuria, Ramsey-WMF, Lahi, PDrouin-WMF, Gq86, E1presidente, Cparle, Darkminds3113, GoranSMilovanovic, QZanden, EBjune, Tramullas, Acer, LawExplorer, Avner, Gehel, FloNight, Susannaanas, Aschroet, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Lydia_Pintscher, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___


[Wikidata-bugs] [Maniphest] [Closed] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting

2018-02-07 Thread chelsyx
chelsyx closed this task as "Resolved".
TASK DETAIL
  https://phabricator.wikimedia.org/T177353

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: PDrouin-WMF, EBernhardson, Aklapper, mpopov, chelsyx, Abit, SandraF_WMF, Ramsey-WMF, Capt_Swing, debt, Lahi, Gq86, E1presidente, Cparle, Darkminds3113, GoranSMilovanovic, QZanden, EBjune, Tramullas, Acer, LawExplorer, Avner, Gehel, FloNight, Susannaanas, Aschroet, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Lydia_Pintscher, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___


[Wikidata-bugs] [Maniphest] [Unblock] T174519: [epic] SDoC: Determine baseline for metrics

2018-02-07 Thread chelsyx
chelsyx closed subtask T179450: Documentation of SDoC findings as "Resolved".
TASK DETAIL
  https://phabricator.wikimedia.org/T174519

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: Nuria, Liuxinyu970226, Capt_Swing, Ramsey-WMF, SandraF_WMF, Abit, chelsyx, mpopov, debt, Aklapper, Lahi, PDrouin-WMF, Gq86, E1presidente, Cparle, Darkminds3113, GoranSMilovanovic, QZanden, EBjune, Tramullas, Acer, LawExplorer, Avner, Gehel, FloNight, Susannaanas, Aschroet, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Lydia_Pintscher, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___


[Wikidata-bugs] [Maniphest] [Unblock] T174519: [epic] SDoC: Determine baseline for metrics

2018-02-07 Thread chelsyx
chelsyx closed subtask T177353: Metrics for SDoC: look at search hits based on which element the search is hitting as "Resolved".
TASK DETAIL
  https://phabricator.wikimedia.org/T174519

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: Nuria, Liuxinyu970226, Capt_Swing, Ramsey-WMF, SandraF_WMF, Abit, chelsyx, mpopov, debt, Aklapper, Lahi, PDrouin-WMF, Gq86, E1presidente, Cparle, Darkminds3113, GoranSMilovanovic, QZanden, EBjune, Tramullas, Acer, LawExplorer, Avner, Gehel, FloNight, Susannaanas, Aschroet, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Lydia_Pintscher, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___


[Wikidata-bugs] [Maniphest] [Closed] T174519: [epic] SDoC: Determine baseline for metrics

2018-02-07 Thread chelsyx
chelsyx closed this task as "Resolved".
chelsyx claimed this task.

TASK DETAIL
  https://phabricator.wikimedia.org/T174519

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: Nuria, Liuxinyu970226, Capt_Swing, Ramsey-WMF, SandraF_WMF, Abit, chelsyx, mpopov, debt, Aklapper, Lahi, PDrouin-WMF, Gq86, E1presidente, Cparle, Darkminds3113, GoranSMilovanovic, QZanden, EBjune, Tramullas, Acer, LawExplorer, Avner, Gehel, FloNight, Susannaanas, Aschroet, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Lydia_Pintscher, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___


[Wikidata-bugs] [Maniphest] [Changed Project Column] T174519: [epic] SDoC: Determine baseline for metrics

2018-01-11 Thread chelsyx
chelsyx moved this task from Needs review to Done on the Discovery-Analysis (Current work) board.
chelsyx added a comment.
Thank you @Ramsey-WMF! :D

TASK DETAIL
  https://phabricator.wikimedia.org/T174519

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1241/

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: Nuria, Liuxinyu970226, Capt_Swing, Ramsey-WMF, SandraF_WMF, Abit, chelsyx, mpopov, debt, Aklapper, Lahi, PDrouin-WMF, Gq86, E1presidente, Cparle, Darkminds3113, GoranSMilovanovic, QZanden, EBjune, Tramullas, Acer, LawExplorer, Avner, Gehel, FloNight, Susannaanas, Aschroet, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___


[Wikidata-bugs] [Maniphest] [Commented On] T174519: [epic] SDoC: Determine baseline for metrics

2018-01-11 Thread chelsyx
chelsyx added a comment.
@Ramsey-WMF Is there any feedback from the team about the baseline metrics? Could we resolve this ticket and its child tickets?

TASK DETAIL
  https://phabricator.wikimedia.org/T174519

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: Nuria, Liuxinyu970226, Capt_Swing, Ramsey-WMF, SandraF_WMF, Abit, chelsyx, mpopov, debt, Aklapper, Lahi, PDrouin-WMF, Gq86, E1presidente, Cparle, Darkminds3113, GoranSMilovanovic, QZanden, EBjune, Tramullas, Acer, LawExplorer, Avner, Gehel, FloNight, Susannaanas, Aschroet, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___


[Wikidata-bugs] [Maniphest] [Changed Project Column] T179450: Documentation of SDoC findings

2017-12-21 Thread chelsyx
chelsyx moved this task from In progress to Needs review on the Discovery-Analysis (Current work) board.
chelsyx added a comment.
Done: https://meta.wikimedia.org/wiki/Research:Baseline_Metrics_for_Structured_Data_on_Wikimedia_Commons

TASK DETAIL
  https://phabricator.wikimedia.org/T179450

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1241/

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: Aklapper, mpopov, chelsyx, Abit, SandraF_WMF, Capt_Swing, Liuxinyu970226, debt, Nuria, Ramsey-WMF, Lahi, PDrouin-WMF, Gq86, E1presidente, GoranSMilovanovic, QZanden, EBjune, Tramullas, Acer, LawExplorer, Avner, Gehel, FloNight, Susannaanas, Aschroet, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___


[Wikidata-bugs] [Maniphest] [Changed Project Column] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting

2017-12-14 Thread chelsyx
chelsyx moved this task from In progress to Needs review on the Discovery-Analysis (Current work) board.
chelsyx added a comment.
All results and the analysis codebase can be found here: https://github.com/wikimedia-research/SDoC-Initial-Metrics/tree/master/T177353

For unhelpful file names, I created a child ticket, T182849, since it should be a separate project and we don't have the bandwidth to deal with it now.

TASK DETAIL
  https://phabricator.wikimedia.org/T177353

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1241/

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: PDrouin-WMF, EBernhardson, Aklapper, mpopov, chelsyx, Abit, SandraF_WMF, Ramsey-WMF, Capt_Swing, debt, Lahi, Gq86, E1presidente, GoranSMilovanovic, QZanden, EBjune, Tramullas, Acer, Avner, Gehel, FloNight, Susannaanas, Aschroet, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___


[Wikidata-bugs] [Maniphest] [Commented On] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting

2017-12-14 Thread chelsyx
chelsyx added a comment.
Categorization
Excluding hidden categories and 'needing_category' categories, as of December 12, 2017 there are 1,629,592 (3.73%) files that don't belong to any category and 22,492,880 (51.55%) files that belong to only 1 category.
F11832678: nfile_by_categories.png
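
As a rough illustration of the counting rule described above (not the code in the linked repository), a minimal Python sketch could look like this. The input layout, the function name and the is_needing_category predicate are all assumptions made for the example.

```python
from collections import Counter


def visible_category_counts(file_categories, hidden_categories, is_needing_category):
    """Count files by how many visible, substantive categories they have.

    file_categories: dict mapping a file title to its list of categories
    hidden_categories: set of category names treated as hidden
    is_needing_category: predicate marking 'needing_category' style categories
    (all three are hypothetical inputs for this sketch)
    """
    distribution = Counter()
    for categories in file_categories.values():
        visible = [
            c for c in categories
            if c not in hidden_categories and not is_needing_category(c)
        ]
        distribution[len(visible)] += 1
    return distribution


# Toy data, invented for the example.
files = {
    "A.jpg": ["Cats", "Media needing categories"],
    "B.jpg": ["Hidden maintenance"],
    "C.jpg": ["Dogs", "Cats"],
}
counts = visible_category_counts(
    files,
    hidden_categories={"Hidden maintenance"},
    is_needing_category=lambda c: "needing categor" in c.lower(),
)
print(sorted(counts.items()))
```

A file such as A.jpg counts as having one category even though it still carries a 'needing_category' tag, which is the overlap case discussed in this comment.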

Breakdown by media type and analysis codebase can be found here: https://github.com/wikimedia-research/SDoC-Initial-Metrics/tree/master/T177353

If the numbers here conflict with T177353#3743257, that's because files with 'needing_category' categories may have other categories at the same time, possibly because users added categories to a file but forgot to remove 'needing_category', or because the 'needing_category' category was moved to the hidden categories. The graph above shows a more accurate count.

TASK DETAIL
  https://phabricator.wikimedia.org/T177353

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: PDrouin-WMF, EBernhardson, Aklapper, mpopov, chelsyx, Abit, SandraF_WMF, Ramsey-WMF, Capt_Swing, debt, Lahi, Gq86, E1presidente, GoranSMilovanovic, QZanden, EBjune, Tramullas, Acer, Avner, Gehel, FloNight, Susannaanas, Aschroet, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___


[Wikidata-bugs] [Maniphest] [Triaged] T182849: Identify unhelpful file names on commons

2017-12-14 Thread chelsyx
chelsyx triaged this task as "Low" priority.
TASK DETAIL
  https://phabricator.wikimedia.org/T182849

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: thiemowmde, Aklapper, Abit, Ramsey-WMF, mpopov, chelsyx, Lahi, PDrouin-WMF, Gq86, E1presidente, SandraF_WMF, GoranSMilovanovic, QZanden, Tramullas, Acer, Susannaanas, Aschroet, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___


[Wikidata-bugs] [Maniphest] [Updated] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting

2017-12-14 Thread chelsyx
chelsyx added a subtask: T182849: Identify unhelpful file names on commons.
TASK DETAIL
  https://phabricator.wikimedia.org/T177353

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: PDrouin-WMF, EBernhardson, Aklapper, mpopov, chelsyx, Abit, SandraF_WMF, Ramsey-WMF, Capt_Swing, debt, Lahi, Gq86, E1presidente, GoranSMilovanovic, QZanden, EBjune, Tramullas, Acer, Avner, Gehel, FloNight, Susannaanas, Aschroet, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___


[Wikidata-bugs] [Maniphest] [Updated] T182849: Identify unhelpful file names on commons

2017-12-14 Thread chelsyx
chelsyx added a parent task: T177353: Metrics for SDoC: look at search hits based on which element the search is hitting.
TASK DETAIL
  https://phabricator.wikimedia.org/T182849

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: thiemowmde, Aklapper, Abit, Ramsey-WMF, mpopov, chelsyx, Lahi, PDrouin-WMF, Gq86, E1presidente, SandraF_WMF, GoranSMilovanovic, QZanden, Tramullas, Acer, Susannaanas, Aschroet, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___


[Wikidata-bugs] [Maniphest] [Updated] T182849: Identify unhelpful file names on commons

2017-12-14 Thread chelsyx
chelsyx added a comment.
Hello @thiemowmde! The purpose of T177353 and its parent ticket T174519: [epic] SDoC: Determine baseline for metrics is to establish a baseline for metrics on Commons in order to measure future successes of the #structured-data-commons (SDoC) project. The SDoC team and we (#discovery-analysis) came up with a list of things that would be interesting to measure, and created T177353 and other child tickets (see T174519 for more details). This work is exploratory in nature: some metrics in the list are clearly defined, while others are not (for example, the exact meaning of "unhelpful"). Any ideas and comments are very welcome!

The Titleblacklist is used to block certain file names (generic, spam, etc.) through mw:Extension:Title blacklist when users try to upload files with these invalid names. However, regular expressions are not perfect, and some files with "unhelpful" names still get uploaded, e.g. File:Img-071129152243-0001.png and those in the move log whose change reason is meaningless or ambiguous, which currently require a human to identify. That's why I'm thinking about using a machine learning model to help identify these files.

TASK DETAIL
  https://phabricator.wikimedia.org/T182849

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: thiemowmde, Aklapper, Abit, Ramsey-WMF, mpopov, chelsyx, Lahi, PDrouin-WMF, Gq86, E1presidente, SandraF_WMF, GoranSMilovanovic, QZanden, Tramullas, Acer, Susannaanas, Aschroet, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___


[Wikidata-bugs] [Maniphest] [Claimed] T179450: Documentation of SDoC findings

2017-12-14 Thread chelsyx
chelsyx claimed this task.
chelsyx moved this task from Backlog to In progress on the Discovery-Analysis (Current work) board.

TASK DETAIL
  https://phabricator.wikimedia.org/T179450

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1241/

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: Aklapper, mpopov, chelsyx, Abit, SandraF_WMF, Capt_Swing, Liuxinyu970226, debt, Nuria, Ramsey-WMF, Lahi, PDrouin-WMF, Gq86, E1presidente, GoranSMilovanovic, QZanden, EBjune, Tramullas, Acer, Avner, Gehel, FloNight, Susannaanas, Aschroet, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___


[Wikidata-bugs] [Maniphest] [Created] T182849: Identify unhelpful file names on commons

2017-12-13 Thread chelsyx
chelsyx created this task.
chelsyx added projects: Structured-Data-Commons, Discovery-Analysis.
Herald added a subscriber: Aklapper.
Herald added a project: Wikidata.

TASK DESCRIPTION
In T177353, we were asked to get a count of files with unhelpful names. To identify unhelpful file names, we can extract the old and new file names from move log entries whose change reason is meaningless or ambiguous, and then train a classification model.
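
Before any classifier can be trained on such old/new name pairs, each file name has to be turned into numeric features. The sketch below shows one way that step might look; the feature set and the function name are hypothetical choices made for illustration and are not the features used in this project.

```python
import re


def file_name_features(title):
    """Compute simple, hypothetical features for a Commons file title."""
    # Drop a leading "File:" namespace prefix and the file extension.
    name = title[len("File:"):] if title.startswith("File:") else title
    stem = name.rsplit(".", 1)[0]
    chars = [c for c in stem if not c.isspace()]
    digits = sum(c.isdigit() for c in chars)
    # Runs of three or more letters, used as a crude proxy for real words.
    words = re.findall(r"[^\W\d_]{3,}", stem)
    return {
        "length": len(stem),
        "digit_ratio": digits / len(chars) if chars else 0.0,
        "n_words": len(words),
    }


print(file_name_features("File:Img-071129152243-0001.png"))
print(file_name_features("File:Cyclopaedia, Chambers - Volume 1 - 0133.jpg"))
```

A camera-style name is mostly digits with a single word-like token, while a descriptive title has several words and a low digit ratio; a classification model would learn from differences of this kind.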

Putting this project in the backlog for now. I will pick it up when we have some bandwidth.

TASK DETAIL
  https://phabricator.wikimedia.org/T182849

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: Aklapper, Abit, Ramsey-WMF, mpopov, chelsyx, Lahi, PDrouin-WMF, Gq86, E1presidente, SandraF_WMF, GoranSMilovanovic, QZanden, Tramullas, Acer, Susannaanas, Aschroet, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___


[Wikidata-bugs] [Maniphest] [Edited] T177358: Metrics for SDoC: translations

2017-12-12 Thread chelsyx
chelsyx updated the task description. (Show Details)
CHANGES TO TASK DESCRIPTION
...
* [x] how many files/descriptions are in multiple languages?
...
** [x] How many files are in lang X?
** [x] How many have multiple languages in them?
** [x] How many Western industrialized languages?
...

TASK DETAIL
  https://phabricator.wikimedia.org/T177358

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: mpopov, chelsyx
Cc: Aklapper, mpopov, chelsyx, Abit, SandraF_WMF, Ramsey-WMF, Capt_Swing, debt, Lahi, PDrouin-WMF, Gq86, E1presidente, GoranSMilovanovic, QZanden, EBjune, Acer, Avner, Gehel, FloNight, Susannaanas, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___


[Wikidata-bugs] [Maniphest] [Commented On] T177358: Metrics for SDoC: translations

2017-12-12 Thread chelsyx
chelsyx added a comment.
We parsed the wikitext of all files in the Commons XML data dumps of November 20, 2017, and extracted the language templates in them (e.g. {{en}}, {{LangSwitch}}). Out of the total 43,268,565 files, 14,848,551 (34.32%) files don't have any language templates, and 23,780,247 (54.96%) files use only 1 language.
F11792338: files_by_n_languages.png
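
To make the extraction step concrete, here is a minimal Python sketch of pulling language codes out of wikitext with regular expressions. It is not the code used for the numbers above: the patterns, the KNOWN_CODES whitelist and the function name are assumptions, and the sketch only covers {{xx|...}} templates and {{LangSwitch}} parameters.

```python
import re

# Assumption: a small illustrative whitelist; a real run would need the
# full list of language codes used on Commons.
KNOWN_CODES = frozenset({"en", "de", "fr", "es", "it", "ru", "ja", "zh"})

# {{xx|...}} style templates, e.g. {{en|1=A cat}}.
LANG_TEMPLATE_RE = re.compile(r"\{\{\s*([a-z]{2,3}(?:-[a-z]+)?)\s*\|")
# {{LangSwitch|xx=...|yy=...}}; the non-greedy body stops at the first "}}",
# so templates nested inside LangSwitch are not handled.
LANGSWITCH_RE = re.compile(r"\{\{\s*LangSwitch\b(.*?)\}\}", re.DOTALL)
LANGSWITCH_PARAM_RE = re.compile(r"\|\s*([a-z]{2,3}(?:-[a-z]+)?)\s*=")


def languages_in_wikitext(wikitext, known_codes=KNOWN_CODES):
    """Return the set of known language codes referenced in the wikitext."""
    candidates = set(LANG_TEMPLATE_RE.findall(wikitext))
    for body in LANGSWITCH_RE.findall(wikitext):
        candidates.update(LANGSWITCH_PARAM_RE.findall(body))
    # The whitelist filters out short non-language templates such as {{cc-by|...}}.
    return candidates & known_codes


sample = (
    "{{Information|description={{en|1=A cat}}{{de|Eine Katze}}}}\n"
    "{{LangSwitch|fr=chat|es=gato}}\n"
    "{{cc-by|author}}"
)
print(sorted(languages_in_wikitext(sample)))
```

Counting the size of the returned set per file gives the "number of languages" bucket for that file; files with an empty set fall into the "no language templates" group.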

40.1% of all files have English templates, 9.38% of files use German, and 6.2% of files have descriptions in languages that are not in the top 20.
F11792361: top20_languages_nfiles.png

For the files without language templates, we used the langdetect package to detect their languages. We could not detect any language in 556,684 files (1.29% of all 43,268,565 files). We detected 1 language for 7,577,789 (17.51%) files.
F11795099: files_by_n_detected_languages.png

We detected English in 30.25% of all 43,268,565 files and German in 3.93% of files.
F11795155: top20_detected_languages_nfiles.png

TASK DETAIL
  https://phabricator.wikimedia.org/T177358

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: mpopov, chelsyx
Cc: Aklapper, mpopov, chelsyx, Abit, SandraF_WMF, Ramsey-WMF, Capt_Swing, debt, Lahi, PDrouin-WMF, Gq86, E1presidente, GoranSMilovanovic, QZanden, EBjune, Acer, Avner, Gehel, FloNight, Susannaanas, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___


[Wikidata-bugs] [Maniphest] [Blocker] T174519: [epic] SDoC: Determine baseline for metrics

2017-12-12 Thread chelsyx
chelsyx changed the status of subtask T177353: Metrics for SDoC: look at search hits based on which element the search is hitting from "Stalled" to "Open".
TASK DETAIL
  https://phabricator.wikimedia.org/T174519

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: Nuria, Liuxinyu970226, Capt_Swing, Ramsey-WMF, SandraF_WMF, Abit, chelsyx, mpopov, debt, Aklapper, Lahi, PDrouin-WMF, Gq86, E1presidente, GoranSMilovanovic, QZanden, EBjune, Acer, Avner, Gehel, FloNight, Susannaanas, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___


[Wikidata-bugs] [Maniphest] [Changed Status] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting

2017-12-12 Thread chelsyx
chelsyx changed the task status from "Stalled" to "Open".
chelsyx raised the priority of this task from "Low" to "Normal".
chelsyx added a comment.
We parsed the wikitext of all files in the Commons XML data dumps of November 20, 2017. Out of the total 43,268,565 files, 41,796,560 (96.6%) files have an infobox, and 41,309,028 (95.47%) have some content in their description fields (description, title, depicted people, depicted place, etc.).

Caveats:

* There are a large number of infobox-like templates (e.g. Infobox_templates:_based_on_Information_template, Data_ingestion_layout_templates, templates used for only one batch of uploads) with description fields of various names (e.g. some use commons_description instead of description). This makes counting very difficult because we cannot enumerate all of these infobox names and description field names.
* Some users create their own templates on top of other infobox templates for upload convenience. This masks the file description so that it cannot be searched. For example, the wikitext of File:Cyclopaedia, Chambers - Volume 1 - 0133.jpg is:


{{Cyclopaedia, Chambers page
 | volume = 1
 | prev = 0132
 | page = 0133
 | next = 0134
}}

A lot of the information we see on the web page is actually hidden in its template Template:Cyclopaedia,_Chambers_page. This makes it very hard to find this file through search, because search operates on the wikitext of the file shown above. We should encourage our users to clean up this kind of template.

TASK DETAIL
  https://phabricator.wikimedia.org/T177353

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: EBernhardson, Aklapper, mpopov, chelsyx, Abit, SandraF_WMF, Ramsey-WMF, Capt_Swing, debt, Lahi, PDrouin-WMF, Gq86, E1presidente, GoranSMilovanovic, QZanden, EBjune, Acer, Avner, Gehel, FloNight, Susannaanas, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___
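
The masking problem in the comment above can be reproduced with a small Python sketch that reads template parameters from raw wikitext and checks for a description-like field. The field list, the regular expression and the function names are assumptions for illustration; the sketch also assumes one parameter per line, as in the example.

```python
import re

# Assumed set of description-like parameter names; as noted in the comment,
# the real list is open-ended.
DESCRIPTION_FIELDS = {"description", "commons_description", "title"}

# One "| name = value" parameter per line.
PARAM_RE = re.compile(
    r"^[ \t]*\|[ \t]*([^=|{}\n]+?)[ \t]*=[ \t]*(.*?)[ \t]*$", re.MULTILINE
)


def template_params(wikitext):
    """Return {name: value} for template calls written one parameter per line."""
    return {name.lower(): value for name, value in PARAM_RE.findall(wikitext)}


def has_description(wikitext):
    """True when any assumed description field is present and non-empty."""
    params = template_params(wikitext)
    return any(params.get(field) for field in DESCRIPTION_FIELDS)


cyclopaedia = """{{Cyclopaedia, Chambers page
 | volume = 1
 | prev = 0132
 | page = 0133
 | next = 0134
}}"""
print(template_params(cyclopaedia))
print(has_description(cyclopaedia))
```

For the Cyclopaedia page the raw wikitext only exposes volume and page numbers, so a check like this finds no description even though the rendered page shows one.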


[Wikidata-bugs] [Maniphest] [Updated] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting

2017-11-30 Thread chelsyx
chelsyx edited projects, added Discovery-Analysis (Current work); removed Discovery-Analysis.
TASK DETAIL
  https://phabricator.wikimedia.org/T177353

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: EBernhardson, Aklapper, mpopov, chelsyx, Abit, SandraF_WMF, Ramsey-WMF, Capt_Swing, debt, Lahi, PDrouin-WMF, Gq86, E1presidente, GoranSMilovanovic, QZanden, EBjune, Acer, Avner, Gehel, FloNight, Susannaanas, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___


[Wikidata-bugs] [Maniphest] [Claimed] T177358: Metrics for SDoC: translations

2017-11-30 Thread chelsyx
chelsyx claimed this task.
chelsyx moved this task from Backlog to In progress on the Discovery-Analysis (Current work) board.

TASK DETAIL
  https://phabricator.wikimedia.org/T177358

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1241/

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: Aklapper, mpopov, chelsyx, Abit, SandraF_WMF, Ramsey-WMF, Capt_Swing, debt, Lahi, PDrouin-WMF, Gq86, E1presidente, GoranSMilovanovic, QZanden, EBjune, Acer, Avner, Gehel, FloNight, Susannaanas, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Changed Project Column] T177534: Search Metrics for SDoC: eventlogging

2017-11-30 Thread chelsyx
chelsyx moved this task from In progress to Needs review on the Discovery-Analysis (Current work) board.
chelsyx added a comment.
We computed several search metrics from the November 2017 event logging data and compared them with English Wikipedia. They cover searches on desktop only, since Commons has very few searches on mobile web (fewer than 100 search result pages daily).

The zero results rate for full-text search is slightly lower on Commons compared to English Wikipedia:
F11091871: zrr_all.png

However, the clickthrough rate for full-text search is much lower than on English Wikipedia, at only 10.42%:
F11091881: ctr_all.png

Also, users on Commons are much more likely to click to see other pages of search results:
F11091886: serp_offset_all.png
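
For readers unfamiliar with the two rates above, here is a minimal sketch of how they could be computed from per-search records. The record layout (`n_results`, `clicked`) is a hypothetical simplification and not the actual event logging schema, the sample rows are made up, and the clickthrough definition used here (clicks over all searches) is an assumption; the analysis in the linked repository may define it differently.

```python
from dataclasses import dataclass


@dataclass
class Search:
    n_results: int  # number of hits returned for the query (hypothetical field)
    clicked: bool   # whether the user clicked any result (hypothetical field)


def zero_results_rate(searches):
    """Share of searches that returned no results at all."""
    return sum(s.n_results == 0 for s in searches) / len(searches)


def clickthrough_rate(searches):
    """Share of searches followed by at least one click (assumed definition)."""
    return sum(s.clicked for s in searches) / len(searches)


# Made-up sample: four searches, one with zero results, one with a click.
sample = [Search(12, True), Search(0, False), Search(3, False), Search(40, False)]
zrr = zero_results_rate(sample)
ctr = clickthrough_rate(sample)
```

Both functions assume a non-empty list of searches; an empty list would raise a ZeroDivisionError.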

See https://github.com/wikimedia-research/SDoC-Initial-Metrics/tree/master/T177534 for more results.


[Wikidata-bugs] [Maniphest] [Claimed] T177534: Metrics for SDoC: eventlogging

2017-11-21 Thread chelsyx
chelsyx claimed this task.
chelsyx edited projects, added Discovery-Analysis (Current work); removed Discovery-Analysis.


[Wikidata-bugs] [Maniphest] [Commented On] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting

2017-11-07 Thread chelsyx
chelsyx added a comment.
Status of the tasks in this ticket:

* Search hits based on which element the search is hitting: file name vs. description vs. category
  This is not currently feasible. A possible solution is described in T177353#3716344, and we will need help from the search backend team.

* "Unfindable" images metrics: lack of categorization, unhelpful file name, no description (or poor description)
  Categories: The number of files in a "needing categories" category, with a breakdown, is shown in T177353#3743257. We have a query that counts files by number of categories, category type (hidden vs. not) and media type, but we are having problems running it against the MySQL database. A possible solution is available, but it would take some time.
  Description: We could use advanced search and/or parse the page content with Hive (using an experimental table set up by Analytics), but it would take some time.
  File name: We could do this with machine learning as described in T177353#3712897, but it would take some time to train and tune the model.

* Investigate file annotations and whether any tracking (logging) of them is available
  Done. See T177353#3711572

Given the difficulties described above, @debt and I decided to put this ticket in the backlog and work on other SDoC metrics first.


[Wikidata-bugs] [Maniphest] [Commented On] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting

2017-11-07 Thread chelsyx
chelsyx added a comment.
On November 7, the number of files in a "needing categories" category was 4,268,386 (10%). The following table breaks down the counts by media type:


img_media_type  need_cat   n_files  proportion
bitmap          no        36176941      84.47%
bitmap          yes        4207232       9.82%
drawing         no         1167389       2.73%
drawing         yes          17744       0.04%
audio           no          792223       1.85%
audio           yes           2625       0.01%
video           no           71944       0.17%
video           yes          36613       0.09%
multimedia      no               4          0%
office          no          351035       0.82%
office          yes           4172       0.01%
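
As a sanity check, the counts in the table above can be recombined to reproduce the headline figure and the proportion column. This is only a sketch of the arithmetic; the variable and function names are my own, and the counts are copied from the table.

```python
# (img_media_type, need_cat) -> n_files, copied from the table above
counts = {
    ("bitmap", "no"): 36176941,
    ("bitmap", "yes"): 4207232,
    ("drawing", "no"): 1167389,
    ("drawing", "yes"): 17744,
    ("audio", "no"): 792223,
    ("audio", "yes"): 2625,
    ("video", "no"): 71944,
    ("video", "yes"): 36613,
    ("multimedia", "no"): 4,
    ("office", "no"): 351035,
    ("office", "yes"): 4172,
}

total_files = sum(counts.values())
needing_categories = sum(n for (_, need_cat), n in counts.items() if need_cat == "yes")


def proportion(key):
    """Percentage of all files that fall into one (media type, need_cat) row."""
    return round(100 * counts[key] / total_files, 2)


share_needing = round(100 * needing_categories / total_files)
```

The "yes" rows sum to the 4,268,386 files quoted in the comment, and their share of all files rounds to the quoted 10%.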



[Wikidata-bugs] [Maniphest] [Commented On] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting

2017-10-30 Thread chelsyx
chelsyx added a comment.

In T177353#3714007, @debt wrote:
Oh, that looks like that will be quite interesting, @chelsyx, although it looks like it might be a bit of manual work involved.


Getting data from the move log is easy, but it will take some time to train and adjust the model. @debt @Ramsey-WMF Let me know if you want me to spend time on getting other metrics done rather than this.


[Wikidata-bugs] [Maniphest] [Commented On] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting

2017-10-30 Thread chelsyx
chelsyx added a comment.

In T177353#3716995, @debt wrote:
Great idea, @EBernhardson, let's do it! @chelsyx can you get that sampling from the data we already have?


@debt Yes, I can get those queries from the TestSearchSatisfaction2 table. We will need help from @EBernhardson to run them against the test cluster and check the results.


[Wikidata-bugs] [Maniphest] [Commented On] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting

2017-10-26 Thread chelsyx
chelsyx added a comment.
For unhelpful file names, I want to extract the old and new file names from move-log entries whose rename reason is "meaningless or ambiguous", and then train a model to classify these file names. As far as I know, short text classification like this is a bit tricky. @mpopov do you have any suggestions?
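
The idea in the comment above could be prepared for a classifier roughly as follows. This is a sketch under stated assumptions, not the approach actually used: treating the pre-rename name as "unhelpful" and the post-rename name as "helpful" follows the comment, but the character n-gram features are my own suggestion for short text, and the function names and the example file names are hypothetical.

```python
import re


def label_pairs(moves):
    """Turn (old_name, new_name) rename pairs into (file_name, label) examples.

    Label 1 marks the name before the rename (presumed unhelpful),
    label 0 the name after the rename (presumed helpful).
    """
    examples = []
    for old_name, new_name in moves:
        examples.append((old_name, 1))
        examples.append((new_name, 0))
    return examples


def char_ngrams(name, n=3):
    """Character n-grams of the lower-cased file name without its extension."""
    stem = re.sub(r"\.[A-Za-z0-9]+$", "", name).lower()
    return [stem[i:i + n] for i in range(len(stem) - n + 1)]


# Made-up rename pair, only for illustration.
moves = [("DSC 0001.jpg", "Eiffel Tower at night.jpg")]
training_examples = label_pairs(moves)
```

The labelled examples and n-gram lists could then be fed to any bag-of-features classifier; names shorter than `n` characters simply yield no n-grams.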


[Wikidata-bugs] [Maniphest] [Edited] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting

2017-10-25 Thread chelsyx
chelsyx updated the task description.
CHANGES TO TASK DESCRIPTION
...
* [x] investigate file annotations and if any tracking (logging) of them are available
...


[Wikidata-bugs] [Maniphest] [Edited] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting

2017-10-25 Thread chelsyx
chelsyx updated the task description.
CHANGES TO TASK DESCRIPTION
...
** After talking with @EBernhardson, we decided this is not feasible since we don't record this information now


[Wikidata-bugs] [Maniphest] [Commented On] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting

2017-10-25 Thread chelsyx
chelsyx added a comment.
There are 142,994 files with annotations (ImageNote); follow this link for the most current count.

The revision history of annotations is kept along with the rest of the page revision history, for example: https://commons.wikimedia.org/w/index.php?title=File:Henley_2009_women.jpg

@Ramsey-WMF Is this what you want?


[Wikidata-bugs] [Maniphest] [Edited] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting

2017-10-24 Thread chelsyx
chelsyx added a subscriber: EBernhardson.
chelsyx updated the task description.
CHANGES TO TASK DESCRIPTION
...
* [x] file name vs. description vs. category
* [] "Unfindable" images metrics
* [] After talking with @EBernhardson, we decided this is not feasible since we don't record this information now.
* [] "Unfindable" images metrics: lack of categorization, unhelpful file name, no description (or poor description)
...


[Wikidata-bugs] [Maniphest] [Commented On] T177354: Metrics for SDoC: look at contributions

2017-10-24 Thread chelsyx
chelsyx added a comment.
Good idea! Thanks @Nuria!


[Wikidata-bugs] [Maniphest] [Commented On] T177354: Metrics for SDoC: look at contributions

2017-10-23 Thread chelsyx
chelsyx added a comment.
Hi @Nuria, the numbers I showed above are cumulative sums at the end of each month, while the numbers you mentioned are new uploads for each month. From my query, for Dec 2016, the number of newly uploaded files is 392,566 by bots and 392,786 by users. This is close to what is shown on https://stats.wikimedia.org/wikispecial/EN/TablesWikipediaCOMMONS.htm.
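
To make the distinction concrete, here is a small sketch of how per-month new uploads relate to the cumulative totals. The month keys follow the `yr_month` format produced by `LEFT(img_timestamp, 6)`; the function name and the toy values are made up for illustration and are not real upload counts.

```python
from itertools import accumulate


def cumulative(monthly_new):
    """Running total at the end of each month, given new uploads per month."""
    months = sorted(monthly_new)
    return dict(zip(months, accumulate(monthly_new[m] for m in months)))


# Made-up toy values keyed by yr_month; NOT actual Commons numbers.
monthly_new = {"201610": 3, "201611": 5, "201612": 2}
running_total = cumulative(monthly_new)
```

A cumulative series only ever grows, so it will always look larger than the per-month figures it is built from.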

I think the differences come from two sources:
1. I assume the numbers on https://stats.wikimedia.org/wikispecial/EN/TablesWikipediaCOMMONS.htm are computed at the end of each month, and files could be deleted afterwards. For the numbers above, I used the image table, which only counts the files that still existed on Oct 12, 2017.
2. According to the Commons page on bots, not all accounts operated as bots have a bot flag, so I also include accounts whose user pages are in a category matching "bot_flag" or "bots" (see the query below).

Query for counting newly uploaded files on commons:

SELECT LEFT(img_timestamp, 6) AS yr_month, user_group, COUNT(*) AS n_files
FROM (
  -- Get active/inactive bots
  SELECT ug_user AS user_id, ug_group AS user_group
  FROM user_groups
  WHERE ug_group = 'bot'
  UNION
  SELECT ufg_user AS user_id, ufg_group AS user_group
  FROM user_former_groups
  WHERE ufg_group = 'bot'
  UNION
  -- Get user ids with bot categories in their user pages
  SELECT user.user_id, 'bot' AS user_group
  FROM user INNER JOIN (
    -- all user page names with bot category
    SELECT REPLACE(page.page_title, '_', ' ') AS user_name
    FROM page INNER JOIN (
      -- page ids with bot categories
      SELECT DISTINCT cl_from AS page_id
      FROM categorylinks
      WHERE cl_to REGEXP '_(bot_flag|bots)(_|$)'
        AND cl_type = 'page'
    ) AS bot_cat ON page.page_id = bot_cat.page_id
    WHERE page_namespace = 2
  ) AS bot_name ON user.user_name = bot_name.user_name
) AS bots RIGHT JOIN image ON bots.user_id = image.img_user
-- Uploaders with no match in the bot list get user_group = NULL (regular users)
GROUP BY LEFT(img_timestamp, 6), user_group;


[Wikidata-bugs] [Maniphest] [Claimed] T177353: Metrics for SDoC: look at search hits based on which element the search is hitting

2017-10-18 Thread chelsyx
chelsyx claimed this task.
chelsyx edited projects, added Discovery-Analysis (Current work); removed Discovery-Analysis.


[Wikidata-bugs] [Maniphest] [Commented On] T177354: Metrics for SDoC: look at contributions

2017-10-16 Thread chelsyx
chelsyx added a comment.
Codebase and output: https://github.com/wikimedia-research/SDoC-Initial-Metrics/tree/master/T177354


[Wikidata-bugs] [Maniphest] [Commented On] T177354: Metrics for SDoC: look at contributions

2017-10-13 Thread chelsyx
chelsyx added a comment.
@mpopov yup, I will put my stuff in the repo.


[Wikidata-bugs] [Maniphest] [Edited] T177354: Metrics for SDoC: look at contributions

2017-10-12 Thread chelsyx
chelsyx updated the task description.
CHANGES TO TASK DESCRIPTION
...
* [x] individuals
* [x] mass-tools/institutions
* [x] number of contributions as of present time
* [x] compare to what it looked like 30 days ago


[Wikidata-bugs] [Maniphest] [Commented On] T177354: Metrics for SDoC: look at contributions

2017-10-12 Thread chelsyx
chelsyx added a comment.
The following two graphs break down the numbers by month:
F10169825: nfile_bot_month.png

F10169827: nfile_bot_month_prop.png


[Wikidata-bugs] [Maniphest] [Commented On] T177354: Metrics for SDoC: look at contributions

2017-10-12 Thread chelsyx
chelsyx added a comment.
Updated: On Oct 12, 2017, the number of files uploaded by bots is 9,390,721 (22.03%), and the number of files uploaded by users is 33,241,541 (77.97%). The following table breaks down the counts by media type:


Media Type  User Group  Number of Files  Proportion
bitmap      user               31355343      73.55%
bitmap      bot                 8843447      20.74%
drawing     user                 905964       2.13%
drawing     bot                  270516       0.63%
audio       user                 698566       1.64%
audio       bot                   95646       0.22%
video       user                  71738       0.17%
video       bot                   36329       0.09%
multimedia  user                      4          0%
office      user                 209926       0.49%
office      bot                  144783       0.34%


[Wikidata-bugs] [Maniphest] [Commented On] T177354: Metrics for SDoC: look at contributions

2017-10-11 Thread chelsyx
chelsyx added a comment.
@mpopov Looks like the file type categorization on Commons is messier than we thought...
For example, File:Krazy_Kat_Bugolist_1916_silent.ogv is an ogv file, but its img_minor_mime is ogg, img_major_mime is application, and img_media_type is video. The same holds for other ogv files. For ogg files like File:Whitenoisesound.ogg, img_minor_mime is ogg and img_major_mime is application as well, but img_media_type is audio.
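
A small sketch of why this matters for the breakdown: using only the two example files and the field values quoted above, grouping by img_major_mime lumps both files together, while grouping by img_media_type separates them. The dictionary layout is my own and only mirrors the column names of the image table.

```python
from collections import Counter

# The two example files, with the field values quoted in the comment above.
files = [
    {
        "name": "Krazy_Kat_Bugolist_1916_silent.ogv",
        "img_major_mime": "application",
        "img_minor_mime": "ogg",
        "img_media_type": "video",
    },
    {
        "name": "Whitenoisesound.ogg",
        "img_major_mime": "application",
        "img_minor_mime": "ogg",
        "img_media_type": "audio",
    },
]

# Grouping by MIME type cannot tell the video from the audio file.
by_major_mime = Counter(f["img_major_mime"] for f in files)
# Grouping by media type keeps them apart.
by_media_type = Counter(f["img_media_type"] for f in files)
```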

Not sure if the field img_media_type is more trustworthy...


[Wikidata-bugs] [Maniphest] [Commented On] T177354: Metrics for SDoC: look at contributions

2017-10-11 Thread chelsyx
chelsyx added a comment.
Hey @chelsyx - what time frame does this cover?

Jumping in to say this looks like it's from launch of Commons to now.

Thanks @mpopov! Yes, these are the file counts as of Oct 10.

Can we also get a count of how this has changed over the last week and compare that to the last 30 days? It'd be interesting to see if the numbers are fairly consistent (individual vs institution) or if they have changed quite a bit when extending the time scope.

@chelsyx this may be useful: https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Edits as it contains monthly snapshots of the page & user tables as of April 2017

Unfortunately, the MediaWiki snapshot doesn't have the image table, which describes images and other uploaded files.


[Wikidata-bugs] [Maniphest] [Changed Project Column] T177354: Metrics for SDoC: look at contributions

2017-10-10 Thread chelsyx
chelsyx moved this task from In progress to Needs review on the Discovery-Analysis (Current work) board.
chelsyx added a comment.
The number of files uploaded by bots is 9,390,408 (22.04%), and the number of files uploaded by users is 33,222,828 (77.96%). The following table breaks down the counts by MIME type:


img_major_mime  user_group   n_files
application     user          927448
application     bot           273617
audio           user           12479
audio           bot             2206
image           user        32242778
image           bot          9113650
video           user           40133
video           bot              935



Query:

SELECT img_major_mime, user_group, COUNT(*) AS n_files
FROM (
  -- Get active/inactive bots
  SELECT ug_user AS user_id, ug_group AS user_group
  FROM user_groups
  WHERE ug_group = 'bot'
  UNION
  SELECT ufg_user AS user_id, ufg_group AS user_group
  FROM user_former_groups
  WHERE ufg_group = 'bot'
  UNION
  -- Get user ids with bot categories in their user pages
  SELECT user.user_id, 'bot' AS user_group
  FROM user INNER JOIN (
    -- all user page names with bot category
    SELECT REPLACE(page.page_title, '_', ' ') AS user_name
    FROM page INNER JOIN (
      -- page ids with bot categories
      SELECT DISTINCT cl_from AS page_id
      FROM categorylinks
      WHERE cl_to REGEXP '_(bot_flag|bots)(_|$)'
        AND cl_type = 'page'
    ) AS bot_cat ON page.page_id = bot_cat.page_id
    WHERE page_namespace = 2
  ) AS bot_name ON user.user_name = bot_name.user_name
) AS bots RIGHT JOIN image ON bots.user_id = image.img_user
-- Uploaders with no match in the bot list get user_group = NULL (regular users)
GROUP BY img_major_mime, user_group;


[Wikidata-bugs] [Maniphest] [Changed Project Column] T177354: Metrics for SDoC: look at contributions

2017-10-06 Thread chelsyx
chelsyx moved this task from Needs triage to Current work on the Discovery-Analysis board.
chelsyx edited projects, added Discovery-Analysis (Current work); removed Discovery-Analysis.


[Wikidata-bugs] [Maniphest] [Claimed] T177354: Metrics for SDoC: look at contributions

2017-10-06 Thread chelsyx
chelsyx claimed this task.


[Wikidata-bugs] [Maniphest] [Commented On] T143762: WDQS: Geographic breakdown of SPARQL queries

2016-10-04 Thread chelsyx
chelsyx added a comment.
Thank you @debt! :)


[Wikidata-bugs] [Maniphest] [Commented On] T143762: WDQS: Geographic breakdown of SPARQL queries

2016-10-04 Thread chelsyx
chelsyx added a comment.
Thanks @debt! Updated on Commons!


[Wikidata-bugs] [Maniphest] [Commented On] T143762: WDQS: Geographic breakdown of SPARQL queries

2016-10-03 Thread chelsyx
chelsyx added a comment.
Modified: F4553759: report.pdf
@debt Please let me know if anything else needs to be changed.


[Wikidata-bugs] [Maniphest] [Commented On] T143762: WDQS: Geographic breakdown of SPARQL queries

2016-10-03 Thread chelsyx
chelsyx added a comment.
Thanks everyone! I've uploaded the report to Commons: https://commons.wikimedia.org/wiki/File:Exploration_on_the_Use_of_WDQS_-_Breakdown_by_Geography,_User_Agent_and_Referer_Class.pdf


[Wikidata-bugs] [Maniphest] [Commented On] T143762: WDQS: Geographic breakdown of SPARQL queries

2016-09-29 Thread chelsyx
chelsyx added a comment.
@Smalyshev what do you mean by "error responses"?
Here is an example of my query:

SELECT CONCAT(year,'-',month,'-',day) AS dt, 
PERCENTILE_APPROX(time_firstbyte, 0.5) AS median_time_firstbyte,
PERCENTILE(response_size, 0.5) AS median_response_size
FROM webrequest
WHERE year = 2016 AND month = 07 AND day = 01
AND webrequest_source = 'misc'
AND uri_host = 'query.wikidata.org'
AND uri_path = '/bigdata/namespace/wdq/sparql'
AND http_status IN('200','304')
AND INSTR(uri_query, '?query=') > 0
GROUP BY CONCAT(year,'-',month,'-',day);TASK DETAILhttps://phabricator.wikimedia.org/T143762EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: chelsyxCc: Addshore, Aklapper, mpopov, Smalyshev, debt, mschwarzer, Avner, Gehel, D3r1ck01, Jonas, FloNight, Xmlizer, Izno, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, Mbch331___


[Wikidata-bugs] [Maniphest] [Commented On] T143762: WDQS: Geographic breakdown of SPARQL queries

2016-09-29 Thread chelsyx
chelsyx added a comment.
Updated Reviewers: F4537643: report.pdf 
@debt and @Smalyshev, your suggestions are very welcome!!! :)

TASK DETAIL
  https://phabricator.wikimedia.org/T143762

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: Addshore, Aklapper, mpopov, Smalyshev, debt, mschwarzer, Avner, Gehel, D3r1ck01, Jonas, FloNight, Xmlizer, Izno, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, Mbch331
___


[Wikidata-bugs] [Maniphest] [Commented On] T143762: WDQS: Geographic breakdown of SPARQL queries

2016-09-19 Thread chelsyx
chelsyx added a comment.
3rd draft: F4487819: report.pdf

TASK DETAIL
  https://phabricator.wikimedia.org/T143762

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: Addshore, Aklapper, mpopov, Smalyshev, debt, mschwarzer, Avner, Gehel, D3r1ck01, Jonas, FloNight, Xmlizer, Izno, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, Mbch331
___


[Wikidata-bugs] [Maniphest] [Commented On] T143762: WDQS: Geographic breakdown of SPARQL queries

2016-09-09 Thread chelsyx
chelsyx added a comment.
Second Draft: F4452046: report.pdf

TASK DETAIL
  https://phabricator.wikimedia.org/T143762

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: Addshore, Aklapper, mpopov, Smalyshev, debt, mschwarzer, Avner, Gehel, D3r1ck01, Jonas, FloNight, Xmlizer, Izno, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, Mbch331
___


[Wikidata-bugs] [Maniphest] [Commented On] T143762: WDQS: Geographic breakdown of SPARQL queries

2016-08-31 Thread chelsyx
chelsyx added a comment.
First draft of the report: F4420629: report.pdf
I put a lot of material into the report. However, because I lack domain knowledge, I don't have a clear idea of which questions are meaningful or useful to answer, so any suggestions are very welcome!

TASK DETAIL
  https://phabricator.wikimedia.org/T143762

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: chelsyx
Cc: Aklapper, mpopov, Smalyshev, debt, mschwarzer, MelodyKramer, Avner, Gehel, D3r1ck01, Jonas, FloNight, Xmlizer, Izno, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, Mbch331
___