[Wikidata-bugs] [Maniphest] T272192: Migrate to new Wikidata Analytics

2021-01-15 Thread GoranSMilovanovic
GoranSMilovanovic created this task. GoranSMilovanovic added projects: Wikidata-Bridge, WMDE-Analytics-Engineering, User-GoranSMilovanovic. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION The new Wikidata Analytics service

[Wikidata-bugs] [Maniphest] T202610: Cognate dashboard requests tracking

2020-12-19 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. The requests presented in T202610#5535326 <https://phabricator.wikimedia.org/T202610#5535326> and T202610#5535427 <https://phabricator.wikimedia.org/T202610#5535427> are unclear. Please define the requests in a clear and concise la

[Wikidata-bugs] [Maniphest] T270109: Hoover Inequality Score Data Retrieval and Calculation

2020-12-14 Thread GoranSMilovanovic
GoranSMilovanovic added projects: WMDE-Analytics-Engineering, User-GoranSMilovanovic. GoranSMilovanovic claimed this task. GoranSMilovanovic added a comment. @Jan_Dittrich Got it. I will get back to you if it turns out that I need more info. TASK DETAIL https://phabricator.wikimedia.org

[Wikidata-bugs] [Maniphest] T269587: Low hanging fruits for the WMDE Data Quality WD/WB Team

2020-12-14 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher Can we resolve this ticket or do we need anything else here? TASK DETAIL https://phabricator.wikimedia.org/T269587 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GoranSMilovanovic Cc

[Wikidata-bugs] [Maniphest] T269587: Low hanging fruits for the WMDE Data Quality WD/WB Team

2020-12-11 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher Here it goes: F33943238: propertyLanguages_20201211.csv <https://phabricator.wikimedia.org/F33943238> TASK DETAIL https://phabricator.wikimedia.org/T269587

[Wikidata-bugs] [Maniphest] T269587: Low hanging fruits for the WMDE Data Quality WD/WB Team

2020-12-11 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher Of course, it will be produced and posted here during the day. TASK DETAIL https://phabricator.wikimedia.org/T269587

[Wikidata-bugs] [Maniphest] T195702: track quality of all/top 10000 Wikidata items over time

2020-12-10 Thread GoranSMilovanovic
GoranSMilovanovic closed subtask T234161: WD Data Quality: compare quality vs usage on commons vs everything else as Resolved. TASK DETAIL https://phabricator.wikimedia.org/T195702

[Wikidata-bugs] [Maniphest] T234161: WD Data Quality: compare quality vs usage on commons vs everything else

2020-12-10 Thread GoranSMilovanovic
GoranSMilovanovic closed this task as "Resolved". TASK DETAIL https://phabricator.wikimedia.org/T234161

[Wikidata-bugs] [Maniphest] T269587: Low hanging fruits for the WMDE Data Quality WD/WB Team

2020-12-10 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. With respect to T269587#6680787 <https://phabricator.wikimedia.org/T269587#6680787> - we need to change the anchor (languages with a Wikimedia Language Code). TASK DETAIL https://phabricator.wikimedia.org/T269587

[Wikidata-bugs] [Maniphest] T269587: Low hanging fruits for the WMDE Data Quality WD/WB Team

2020-12-09 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher > Check coverage of labels, descriptions, aliases on Properties Please see the `csv` file attached. Fields: - `property` - `labels` - how many labels - `aliases` - in how many different languages do we find alia

[Wikidata-bugs] [Maniphest] T269587: Low hanging fruits for the WMDE Data Quality WD/WB Team

2020-12-09 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher Ok, the data reported in T269587#6679451 <https://phabricator.wikimedia.org/T269587#6679451> seem to be fine. The list of all "hanging items" - items with no `P31`, `P279`, or `P361` value - relative to what was fo
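
The "hanging items" check described above can be sketched in plain Python. This is only an illustration, not the PySpark procedure used in the thread: it assumes each entity is a dict in the Wikidata JSON dump layout, where `claims` maps property IDs to lists of statements. The function names and the sample entities are hypothetical.

```python
from typing import Iterable, Mapping

# Properties named in the thread: instance of, subclass of, part of.
CLASSIFYING_PROPERTIES = ("P31", "P279", "P361")


def is_hanging(entity: Mapping) -> bool:
    """Return True if the entity has no statement for any classifying property."""
    # A missing or empty `claims` value is treated as "no statements".
    claims = entity.get("claims") or {}
    return not any(claims.get(pid) for pid in CLASSIFYING_PROPERTIES)


def count_hanging(entities: Iterable[Mapping]) -> int:
    """Count the entities that are not classified via P31, P279, or P361."""
    return sum(1 for entity in entities if is_hanging(entity))


# Made-up sample entities, for illustration only.
sample = [
    {"id": "Q1", "claims": {"P31": [{"mainsnak": {}}]}},  # classified
    {"id": "Q2", "claims": {"P18": [{"mainsnak": {}}]}},  # hanging
    {"id": "Q3", "claims": {}},                           # hanging
]
hanging_count = count_hanging(sample)
```

On a full dump the same predicate would more likely be applied as a distributed filter; the list-based version here only shows the logic.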

[Wikidata-bugs] [Maniphest] T269587: Low hanging fruits for the WMDE Data Quality WD/WB Team

2020-12-09 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher The data reported in T269587#6679451 <https://phabricator.wikimedia.org/T269587#6679451> will have to undergo revision: I have spotted a glitch in my filtering procedures in PySpark. TASK DETAIL https://phabricator.wikimedia.org/T

[Wikidata-bugs] [Maniphest] T269587: Low hanging fruits for the WMDE Data Quality WD/WB Team

2020-12-09 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher > properties that can be used to check completeness (e.g. "number of children" + "number of participants") > find a list of such "structural" properties Well, what I did was the following:

[Wikidata-bugs] [Maniphest] T269587: Low hanging fruits for the WMDE Data Quality WD/WB Team

2020-12-09 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher > How many entities do we have that are not classified via `instance of P31`, `subclass of P279`, and `part of P361`? According to the most recent HDFS copy of the Wikidata JSON dump (snapshot: `2020-11

[Wikidata-bugs] [Maniphest] T269587: Low hanging fruits for the WMDE Data Quality WD/WB Team

2020-12-09 Thread GoranSMilovanovic
GoranSMilovanovic updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T269587

[Wikidata-bugs] [Maniphest] T269587: Low hanging fruits for the WMDE Data Quality WD/WB Team

2020-12-07 Thread GoranSMilovanovic
GoranSMilovanovic created this task. GoranSMilovanovic added projects: User-GoranSMilovanovic, WMDE-Analytics-Engineering, Wikidata. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION - Produce all "immediately" available indicators derived from the discussion i

[Wikidata-bugs] [Maniphest] T267635: Get 2020 user/editcount data to determine count at percentiles

2020-11-30 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Jan_Dittrich Could you involve @Lydia_Pintscher in the email in relation to this ticket and the inequality measures? ^^ Again, please: did we resolve this one? TASK DETAIL https://phabricator.wikimedia.org/T267635

[Wikidata-bugs] [Maniphest] T267635: Get 2020 user/editcount data to determine count at percentiles

2020-11-25 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Jan_Dittrich Do we need anything else here? I guess we can continue the discussion on inequality measures that you have started via e-mail? TASK DETAIL https://phabricator.wikimedia.org/T267635

[Wikidata-bugs] [Maniphest] T267635: Get 2020 user/editcount data to determine count at percentiles

2020-11-19 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Jan_Dittrich Results shared via e-mail. TASK DETAIL https://phabricator.wikimedia.org/T267635

[Wikidata-bugs] [Maniphest] T267143: track average number of labels over time for different groups of Items

2020-11-19 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher We're running in production now: https://wmdeanalytics.wmflabs.org/WD_LanguagesLandscape/ The fact that some recent data points (e.g. for the November 02 snapshot of the WD dump) are missing in some charts is an anomaly that should

[Wikidata-bugs] [Maniphest] T267143: track average number of labels over time for different groups of Items

2020-11-19 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher > Then maybe the header could say "Q6999 and subclasses" or similar? Of course. > Besides that: all good to go for me :) Ok. I need to run just one additional data consistency check, and then we go in p

[Wikidata-bugs] [Maniphest] T267143: track average number of labels over time for different groups of Items

2020-11-19 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher The ASTRONOMICAL OBJECTS we use here are: - everything that is found in the subclasses of Q6999, which is exactly what the Python blob that you refer to lists + all immediate P31 <https://phabricator.wikimedia.org/

[Wikidata-bugs] [Maniphest] T267143: track average number of labels over time for different groups of Items

2020-11-19 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. > For the last graphs: can we have the inverse? So the values for all Items except astronomical objects and scientific papers? @Lydia_Pintscher Please scroll down the dashboard landing page to discover the WIKIDATA - (ASTRONOMY + SCHOLARLY PAP

[Wikidata-bugs] [Maniphest] T267143: track average number of labels over time for different groups of Items

2020-11-18 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher Test server: http://datakolektiv.org/app/WD_LanguagesLandscape Please allow me some time to figure out why not all datasets were updated up to the latest update timestamp: `2020/11/02`. However, this seems to be the solution that you

[Wikidata-bugs] [Maniphest] T267144: track average Item quality over time for different groups of Items

2020-11-18 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Ladsgroup Get in touch, I've managed to solve > the biggest problem is that first I need to get huge list of all items that are one of scientific paper or astronomical object in my work on T267143 <https://phabricator.wikimedia.org/T267143&g

[Wikidata-bugs] [Maniphest] T267143: track average number of labels over time for different groups of Items

2020-11-11 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher I am on it. TASK DETAIL https://phabricator.wikimedia.org/T267143

[Wikidata-bugs] [Maniphest] T267635: Get 2020 user/editcount data to determine count at percentiles

2020-11-10 Thread GoranSMilovanovic
GoranSMilovanovic claimed this task. GoranSMilovanovic added a project: User-GoranSMilovanovic. TASK DETAIL https://phabricator.wikimedia.org/T267635

[Wikidata-bugs] [Maniphest] T267143: track average number of labels over time for different groups of Items

2020-11-03 Thread GoranSMilovanovic
GoranSMilovanovic claimed this task. GoranSMilovanovic added a project: User-GoranSMilovanovic. TASK DETAIL https://phabricator.wikimedia.org/T267143

[Wikidata-bugs] [Maniphest] T194748: Research: How are bot edits different than human edits on Wikidata?

2020-10-27 Thread GoranSMilovanovic
GoranSMilovanovic removed a project: Wikidata. GoranSMilovanovic added a comment. @Lydia_Pintscher @Jan_Dittrich This ticket has been open and assigned to me for two years already. Please let me know if additional details are available and whether we should plan to proceed

[Wikidata-bugs] [Maniphest] T259105: Qurator: Data about Current Events

2020-10-20 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. 2020/10/20: - updated module crashed again: no `type` field was received from the Wikidata API; - fixing now. TASK DETAIL https://phabricator.wikimedia.org/T259105

[Wikidata-bugs] [Maniphest] T259105: Qurator: Data about Current Events

2020-10-18 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. 2020/10/18: - freeze again; problem detected: - MediaWiki API did not return one field (`old_revid`); - action: potential bug fix, restart system, continue monitoring. TASK DETAIL https://phabricator.wikimedia.org/T259105
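
A freeze caused by a field missing from an API response can be guarded against by validating each record before use. The sketch below is a hedged illustration in Python, not the dashboard's actual code (the thread mentions R modules). Only `old_revid` is taken from the comment above; the other required fields and the function name `parse_change` are assumptions.

```python
# Only `old_revid` comes from the thread; `revid` and `title` are assumed here.
REQUIRED_FIELDS = ("old_revid", "revid", "title")


def parse_change(record):
    """Return (parsed, missing): parsed is None when a required field is absent."""
    # `is None` rather than truthiness, so a legitimate value of 0 is kept.
    missing = [field for field in REQUIRED_FIELDS if record.get(field) is None]
    if missing:
        return None, missing
    return {field: record[field] for field in REQUIRED_FIELDS}, []


# Made-up records, for illustration only.
complete = {"old_revid": 0, "revid": 123, "title": "Q42"}
incomplete = {"revid": 124, "title": "Q43"}

parsed_ok, missing_ok = parse_change(complete)
parsed_bad, missing_bad = parse_change(incomplete)
```

Returning the list of missing fields instead of raising lets the caller log the incomplete record and continue with the next one.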

[Wikidata-bugs] [Maniphest] T259105: Qurator: Data about Current Events

2020-10-14 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher As for the problem related to English labels (i.e. the dashboard reporting items without English labels while the same items do have English labels on Wikidata), I was able to detect only cases where the label was created just very

[Wikidata-bugs] [Maniphest] T259105: Qurator: Data about Current Events

2020-10-14 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. - 18:59 CET bug fix - restarted the app. TASK DETAIL https://phabricator.wikimedia.org/T259105

[Wikidata-bugs] [Maniphest] T259105: Qurator: Data about Current Events

2020-10-14 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher @WMDE-leszek The dashboard is live: http://datakolektiv.org:3838/WD_CurrentEvents/ - strict monitoring procedures are in place; - I will be reporting back in case of any errors/fixes; - **please** let me know if the case when

[Wikidata-bugs] [Maniphest] T259105: Qurator: Data about Current Events

2020-10-14 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher @WMDE-leszek Status: - the dashboard is currently moving from prototype to 0.0.1 - it is "devirtualized" and will be run directly from a Shiny Server instance on the test server to - enable for close monitoring of th

[Wikidata-bugs] [Maniphest] T253552: Detailed Reports from game DB

2020-10-06 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher As per your request in a recent e-mail: - all duplicated judgments were filtered out, - the dashboard is ready for you to test: https://wmdeanalytics.wmflabs.org/WD_GameReferenceHunt/ TASK DETAIL https

[Wikidata-bugs] [Maniphest] T234161: WD Data Quality: compare quality vs usage on commons vs everything else

2020-09-06 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. `2020/09/06`: - Need to check one chart: `Diversity of item re-use vs item quality` - some changes following the introduction of optimized data production. TASK DETAIL https://phabricator.wikimedia.org/T234161

[Wikidata-bugs] [Maniphest] T259105: Qurator: Data about Current Events

2020-09-06 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. - Another bug fixed in respect to T259105#6436197 <https://phabricator.wikimedia.org/T259105#6436197>; - continuing monitoring & debugging. TASK DETAIL https://phabricator.wikimedia.org/T259105

[Wikidata-bugs] [Maniphest] T259105: Qurator: Data about Current Events

2020-09-04 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. - Another type of failure for the news module detected; - fixing now. TASK DETAIL https://phabricator.wikimedia.org/T259105

[Wikidata-bugs] [Maniphest] T125095: Create new page view dashboard for Wikidata

2020-09-02 Thread GoranSMilovanovic
GoranSMilovanovic added a subscriber: Lydia_Pintscher. GoranSMilovanovic added a comment. @Addshore @Lydia_Pintscher Let me know if extending the Pageviews per namespace from Wikidata dashboard <https://wmdeanalytics.wmflabs.org/WD_pageviewsPerNamespace/> for additional data feel

[Wikidata-bugs] [Maniphest] T204842: Track the usage of Wikidata entities on main namespace pages in Wikimedia projects

2020-09-02 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher @WMDE-leszek @Addshore Does our Wikidata Usage and Coverage in WMF Projects <https://wmdeanalytics.wmflabs.org/WD_percentUsageDashboard/> dashboard provide a solution for this? I think it does? TASK DETAIL

[Wikidata-bugs] [Maniphest] T259105: Qurator: Data about Current Events

2020-09-02 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. - Gerrit repo requested. TASK DETAIL https://phabricator.wikimedia.org/T259105

[Wikidata-bugs] [Maniphest] T259105: Qurator: Data about Current Events

2020-09-02 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. Found three ways in which the `Q_CE_02-WDNews.R` module (which fetches news articles from NEWSRIVER.io) can fail; - fixed all three; - continuing to monitor. TASK DETAIL https://phabricator.wikimedia.org/T259105

[Wikidata-bugs] [Maniphest] T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore

2020-08-27 Thread GoranSMilovanovic
GoranSMilovanovic closed this task as "Resolved". TASK DETAIL https://phabricator.wikimedia.org/T154601

[Wikidata-bugs] [Maniphest] T108931: [Epic] Improve metrics and statistics for wikidata

2020-08-27 Thread GoranSMilovanovic
GoranSMilovanovic closed subtask T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore as Resolved. TASK DETAIL https://phabricator.wikimedia.org/T108931

[Wikidata-bugs] [Maniphest] T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore

2020-08-24 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher Resolve or inspect further? TASK DETAIL https://phabricator.wikimedia.org/T154601

[Wikidata-bugs] [Maniphest] T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore

2020-08-24 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher > The huge difference between average labels/item and average descriptions/item is quite astonishing... I was just about to ask: do we need to worry about this? Are the numbers plausible, for whatever reason the difference mi

[Wikidata-bugs] [Maniphest] T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore

2020-08-21 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher All changes implemented now, test server: http://datakolektiv.org/WD_LanguagesLandscape/ TASK DETAIL https://phabricator.wikimedia.org/T154601

[Wikidata-bugs] [Maniphest] T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore

2020-08-20 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher > for English for example the number of descriptions and the number of labels seem identical for some of the last values I checked. This should be fixed now. While it is impossible to reconstruct the exact context in wh

[Wikidata-bugs] [Maniphest] T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore

2020-08-20 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher > for Aliases: I assume it's counting two aliases on the same Item in the same language as two and not one? Just making sure. No (good catch!): it is now fixed to do that. > for English for example the number of descri
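
The counting rule confirmed above (two aliases on the same Item in the same language count as two) can be sketched as follows. This is an illustrative Python version under the assumption that each entity follows the Wikidata JSON layout, where `aliases` maps a language code to a list of alias records. The function name and sample data are hypothetical, and the production pipeline may differ.

```python
from collections import Counter


def count_aliases_per_language(entities):
    """Count every alias separately, per language, across all entities."""
    totals = Counter()
    for entity in entities:
        # A missing or empty `aliases` value contributes nothing.
        aliases = entity.get("aliases") or {}
        for language, alias_list in aliases.items():
            # Each alias counts, even when an item has several in one language.
            totals[language] += len(alias_list)
    return totals


# Made-up sample entities, for illustration only.
sample = [
    {"id": "Q1", "aliases": {"en": [{"language": "en", "value": "A"},
                                    {"language": "en", "value": "B"}]}},
    {"id": "Q2", "aliases": {"en": [{"language": "en", "value": "C"}],
                             "de": [{"language": "de", "value": "D"}]}},
    {"id": "Q3"},
]
alias_totals = count_aliases_per_language(sample)
```

Counting items that have at least one alias in a language would instead add 1 per item and language, which is the behaviour the fix moved away from.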

[Wikidata-bugs] [Maniphest] T253552: Detailed Reports from game DB

2020-08-20 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher What is the status of this ticket? Do we need any additional features or work invested here or should we close it? TASK DETAIL https://phabricator.wikimedia.org/T253552

[Wikidata-bugs] [Maniphest] T259105: Qurator: Data about Current Events

2020-08-20 Thread GoranSMilovanovic
GoranSMilovanovic added a subscriber: WMDE-leszek. GoranSMilovanovic added a comment. @Lydia_Pintscher @WMDE-leszek Dashboard prototype is at https://wmdeanalytics.wmflabs.org/WD_CurrentEvents/ TASK DETAIL https://phabricator.wikimedia.org/T259105

[Wikidata-bugs] [Maniphest] T248308: Analyse a small sample of the most often used query patterns on WDQS

2020-08-18 Thread GoranSMilovanovic
GoranSMilovanovic closed this task as "Resolved". TASK DETAIL https://phabricator.wikimedia.org/T248308

[Wikidata-bugs] [Maniphest] T259105: Qurator: Data about Current Events

2020-08-15 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher Status: - implement news search: DONE - proof-of-concept: fetch relevant news for recently edited Wikidata items; Next steps: - decoration: find out about the most popular classes, maybe geo-coordinates where present

[Wikidata-bugs] [Maniphest] T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore

2020-08-15 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher > for English for example the number of descriptions and the number of labels seem identical for some of the last values I checked. And then there is a very very steep decline for the last value. That seems suspici

[Wikidata-bugs] [Maniphest] T259105: Qurator: Data about Current Events

2020-08-12 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher - We have live revision updates from the API implemented, and - real time last 10 minutes and last one hour aggregated revision frequencies tracked. Now: - implement news search; - decoration: find out about the most
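
Tracking revision frequencies over the last 10 minutes and the last hour amounts to counting events in a sliding time window. The sketch below is one possible way to do this in Python and is not taken from the dashboard's code. The class name `RevisionWindow` is hypothetical, and it assumes timestamps are given in seconds and arrive in non-decreasing order.

```python
from collections import deque


class RevisionWindow:
    """Count revisions seen within the last `window_seconds` seconds."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def add(self, timestamp):
        # Assumes timestamps arrive in non-decreasing order.
        self._timestamps.append(timestamp)

    def count(self, now):
        # Drop everything that has fallen out of the window, then count the rest.
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)


# Made-up revision timestamps (seconds), for illustration only.
last_ten_minutes = RevisionWindow(600)
last_hour = RevisionWindow(3600)
for ts in (100, 3100, 3500, 3590):
    last_ten_minutes.add(ts)
    last_hour.add(ts)

ten_minute_count = last_ten_minutes.count(now=3600)
hour_count = last_hour.count(now=3600)
```

Keeping one window per aggregation period means each incoming revision is appended once per window and expired lazily when a count is requested.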

[Wikidata-bugs] [Maniphest] T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore

2020-08-12 Thread GoranSMilovanovic
GoranSMilovanovic lowered the priority of this task from "High" to "Medium". TASK DETAIL https://phabricator.wikimedia.org/T154601

[Wikidata-bugs] [Maniphest] T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore

2020-08-09 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher Please check out the dashboard: http://wmdeanalytics.wmflabs.org/WD_LanguagesLandscape/ - the **Datamodel:Terms** (landing) tab. I am slightly unsure about the numbers reported on item descriptions. What do you think? TASK DETAIL

[Wikidata-bugs] [Maniphest] T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore

2020-08-09 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. - Back-end completed; the data will live here: https://analytics.wikimedia.org/published/datasets/wmde-analytics-engineering/Wikidata/wd_datamodel_terms/ TASK DETAIL https://phabricator.wikimedia.org/T154601

[Wikidata-bugs] [Maniphest] T249654: Categorize different types of Wikidata re-use within Wikimedia projects

2020-08-07 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Isaac Let me know if you need any help. I will take a look at your notes now. Very, very useful work. TASK DETAIL https://phabricator.wikimedia.org/T249654

[Wikidata-bugs] [Maniphest] T249654: Categorize different types of Wikidata re-use within Wikimedia projects

2020-08-05 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Isaac Thank you for this analysis - really useful! TASK DETAIL https://phabricator.wikimedia.org/T249654

[Wikidata-bugs] [Maniphest] T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore

2020-08-03 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher Hey, thank you, I see now. The languages on the three above-mentioned charts seem to sum up to 100%, which probably means - that the charts represent the proportion of labels/aliases/descriptions in a given language - relative

[Wikidata-bugs] [Maniphest] T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore

2020-08-02 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Addshore @Lydia_Pintscher Since the wikidata-datamodel-terms <https://grafana.wikimedia.org/d/00168/wikidata-datamodel-terms?orgId=1=All=30m> Grafana dashboard does not show any data, I find it difficult to understand the y-axis (%) in the

[Wikidata-bugs] [Maniphest] T253552: Detailed Reports from game DB

2020-08-02 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher Please review: http://wmdeanalytics.wmflabs.org/WD_referenceHunt/ TASK DETAIL https://phabricator.wikimedia.org/T253552

[Wikidata-bugs] [Maniphest] T248308: Analyse a small sample of the most often used query patterns on WDQS

2020-08-01 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher We forgot to mention this task in our recent 1:1. In the meantime, I've tested a 10% daily queries sample and the statistics of the smaller, previously used 1% daily queries sample, turn out to be quite representative. However

[Wikidata-bugs] [Maniphest] T253552: Detailed Reports from game DB

2020-07-30 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. - Gerrit repo requested. TASK DETAIL https://phabricator.wikimedia.org/T253552

[Wikidata-bugs] [Maniphest] T253552: Detailed Reports from game DB

2020-07-30 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher Please review: http://wmdeanalytics.wmflabs.org/WD_referenceHunt/ TASK DETAIL https://phabricator.wikimedia.org/T253552

[Wikidata-bugs] [Maniphest] T259105: Qurator: Data about Current Events

2020-07-29 Thread GoranSMilovanovic
GoranSMilovanovic updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T259105

[Wikidata-bugs] [Maniphest] T259105: Qurator: Data about Current Events

2020-07-29 Thread GoranSMilovanovic
GoranSMilovanovic added projects: WMDE-Analytics-Engineering, Wikidata. TASK DETAIL https://phabricator.wikimedia.org/T259105

[Wikidata-bugs] [Maniphest] T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore

2020-07-28 Thread GoranSMilovanovic
GoranSMilovanovic added a parent task: T253345: WMDE Wikidata (non-WDCM) Analytical Systems Optimization. TASK DETAIL https://phabricator.wikimedia.org/T154601

[Wikidata-bugs] [Maniphest] T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore

2020-07-28 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher @GoranSMilovanovic Following our 2020/07/27 1:1 meeting: - the T154601#6337065 <https://phabricator.wikimedia.org/T154601#6337065> approach to incorporate the data from the wikidata-datamodel-terms <https://grafana.wikime

[Wikidata-bugs] [Maniphest] T253552: Detailed Reports from game DB

2020-07-27 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher @GoranSMilovanovic - We want to have a dashboard developed for this - and updated regularly. TASK DETAIL https://phabricator.wikimedia.org/T253552

[Wikidata-bugs] [Maniphest] T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore

2020-07-27 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher > ... the editors should have access to this information as well as it's pretty vital to better understand our biases and gaps in language coverage. I would suggest expanding our existing Wikidata Languages Landscape &l

[Wikidata-bugs] [Maniphest] T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore

2020-07-24 Thread GoranSMilovanovic
GoranSMilovanovic claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T154601

[Wikidata-bugs] [Maniphest] T248308: Analyse a small sample of the most often used query patterns on WDQS

2020-07-24 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher You're welcome. > We should get this list once a quarter or so to find new uses of our data It is perfectly doable. Let's discuss this on Monday and see precisely what data and statistics we want to have reported regula

[Wikidata-bugs] [Maniphest] T253552: Detailed Reports from game DB

2020-07-24 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher > The property_ratio and datatype_ratio files seem to be identical. Sorry, the same variable name was erroneously re-used in my code, here they are: F31947136: item_property_value_ratio.csv <https://phabricator.wikimed

[Wikidata-bugs] [Maniphest] T248308: Analyse a small sample of the most often used query patterns on WDQS

2020-07-22 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @JAllemandou Superfine. Enjoy your holidays! TASK DETAIL https://phabricator.wikimedia.org/T248308

[Wikidata-bugs] [Maniphest] T248308: Analyse a small sample of the most often used query patterns on WDQS

2020-07-22 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @JAllemandou Awesome! You did a nice EDA here + you've analyzed both `event.wdqs_external_sparql_query` and `event.wdqs_internal_sparql_query` - while I've focused only on the `external` source in my previous analyses... So, we do need ML to be able

[Wikidata-bugs] [Maniphest] T248308: Analyse a small sample of the most often used query patterns on WDQS

2020-07-21 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher There is absolutely no correlation between (a) how often a particular `user_agent` value appears, and (b) the mean or median WDQS processing time for that `user_agent`'s SPARQL queries. We can search for particular
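
A check of this kind can be sketched as follows: group query times by `user_agent`, take the per-agent query count and median time, and correlate the two. The thread does not say which correlation coefficient was used, so Pearson's r is an assumption here, as are the function names and the `(user_agent, query_time)` row layout. The sample rows are made up and do not reproduce the reported result.

```python
from collections import defaultdict
from math import sqrt
from statistics import median


def pearson(xs, ys):
    """Pearson correlation coefficient; NaN when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def frequency_vs_median_time(rows):
    """Correlate per-agent query counts with per-agent median query time."""
    times = defaultdict(list)
    for user_agent, query_time in rows:
        times[user_agent].append(query_time)
    counts = [len(values) for values in times.values()]
    medians = [median(values) for values in times.values()]
    return pearson(counts, medians)


# Made-up (user_agent, query_time) rows, chosen to be perfectly correlated.
rows = [
    ("agent-a", 10),
    ("agent-b", 20), ("agent-b", 20),
    ("agent-c", 30), ("agent-c", 30), ("agent-c", 30),
]
r = frequency_vs_median_time(rows)
```

A rank-based coefficient may be a better fit for heavily skewed agent frequencies; swapping it in only changes the `pearson` helper.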

[Wikidata-bugs] [Maniphest] T248308: Analyse a small sample of the most often used query patterns on WDQS

2020-07-21 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher Let's see if there is anything interesting here: F31943519: ref_user_agent_sample.csv <https://phabricator.wikimedia.org/F31943519> Data: - it is produced from a sample of SPARQL querie

[Wikidata-bugs] [Maniphest] T253552: Detailed Reports from game DB

2020-07-20 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher From what we now have on https://wd-ref-island.toolforge.org/stats.php (thanks @ItamarWMDE): - three .csv files are delivered here (see below): - `datatype_ratio.csv` - per datatype statistics (aggregated

[Wikidata-bugs] [Maniphest] T248308: Analyse a small sample of the most often used query patterns on WDQS

2020-07-15 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @JAllemandou Got it, thanks. TASK DETAIL https://phabricator.wikimedia.org/T248308 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GoranSMilovanovic Cc: Samantha_Alipio_WMDE, MGerlach, JAllemandou

[Wikidata-bugs] [Maniphest] T248308: Analyse a small sample of the most often used query patterns on WDQS

2020-07-15 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @JAllemandou However... 0: jdbc:hive2://an-coord1001.eqiad.wmnet:1000> select user_agent_map from event.wdqs_external_sparql_query where year = 2020 and month = 5 and day = 1 limit 10; going to print operations logs printed operations l

[Wikidata-bugs] [Maniphest] T248308: Analyse a small sample of the most often used query patterns on WDQS

2020-07-15 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @JAllemandou Please see T248308#6080150 <https://phabricator.wikimedia.org/T248308#6080150>. I also see that `event.wdqs_external_sparql_query` encompasses the `user_agent_map` so yes I will go for it and not for `wmf.webrequest`. > I have

[Wikidata-bugs] [Maniphest] T248308: Analyse a small sample of the most often used query patterns on WDQS

2020-07-14 Thread GoranSMilovanovic
GoranSMilovanovic reopened this task as "Open". GoranSMilovanovic added a comment. - Re-opening the task to address the question of automated vs. non-automated SPARQL queries observed at the WDQS end-point. - Reference: WMDE in-house email and Google Meet discussions with @dar

[Wikidata-bugs] [Maniphest] [Claimed] T255949: Provide usage statistics on Wikibase APIs

2020-06-22 Thread GoranSMilovanovic
GoranSMilovanovic claimed this task. GoranSMilovanovic added a project: User-GoranSMilovanovic. TASK DETAIL https://phabricator.wikimedia.org/T255949 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GoranSMilovanovic Cc: Aklapper, GoranSMilovanovic

[Wikidata-bugs] [Maniphest] [Commented On] T253552: Detailed Reports from game DB

2020-06-16 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher as agreed in our 1:1 today: - criterion: do not consider property-value pairs that were not reviewed by at least 5 editors; - criterion: 95% of acceptance rate, meaning that everything up to 19 decisions must have a consensus. TASK
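The two criteria above can be sketched as a small filter. This is a minimal sketch under stated assumptions: each property-value pair is assumed to come with a count of accept and reject decisions, one decision per editor, and the function name `passes_criteria` and its parameters are hypothetical, not taken from the task's code.

```python
def passes_criteria(accepted: int, rejected: int,
                    min_reviews: int = 5, min_rate_pct: int = 95) -> bool:
    """Hypothetical filter: enough reviews and a high enough acceptance rate."""
    total = accepted + rejected
    if total < min_reviews:
        # Criterion 1: reviewed by fewer than 5 editors, so not considered.
        return False
    # Criterion 2: acceptance rate of at least 95%.
    # Integer arithmetic avoids floating-point trouble at exactly 95%.
    return accepted * 100 >= min_rate_pct * total
```

The "up to 19 decisions" remark follows from the arithmetic: a single rejection among n decisions gives a rate of (n - 1) / n, which stays below 95% until n reaches 20 (18/19 is about 94.7%, 19/20 is exactly 95%). So `passes_criteria(18, 1)` is rejected while `passes_criteria(19, 1)` is kept.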

[Wikidata-bugs] [Maniphest] [Commented On] T253552: Detailed Reports from game DB

2020-06-15 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. Preliminary results based on T253552#6172533 <https://phabricator.wikimedia.org/T253552#6172533> @Ladsgroup datasets: Per datatype: datatype accepted rejected ratio entity-type 419 119 3.52 text 3

[Wikidata-bugs] [Maniphest] [Commented On] T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore

2020-06-11 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Ladsgroup @Addshore Do you need any help around this thing? TASK DETAIL https://phabricator.wikimedia.org/T154601 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GoranSMilovanovic Cc: Lucas_Werkmeister_WMDE

[Wikidata-bugs] [Maniphest] [Commented On] T119976: Track number of stubs on top 20 wikipedias

2020-05-30 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher @Addshore Given the status reset - T119976#6178863 <https://phabricator.wikimedia.org/T119976#6178863> - of this task, what do we say: go, no go, priority? TASK DETAIL https://phabricator.wikimedia.org/T119976 EMAIL PREFERENCES

[Wikidata-bugs] [Maniphest] [Commented On] T253552: Detailed Reports from game DB

2020-05-28 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Ladsgroup Thanks for the datasets. @darthmon_wmde Thanks for the follow up. @ItamarWMDE Nice to meet you too :) My LDAP is GoranSMilovanovic and I was able to login to Toolforge from it (+2FA,) just a minute ago. TASK DETAIL https

[Wikidata-bugs] [Maniphest] [Commented On] T253552: Detailed Reports from game DB

2020-05-28 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. Could someone please ping me when we have the data for this and let me know where the data live? TASK DETAIL https://phabricator.wikimedia.org/T253552 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences

[Wikidata-bugs] [Maniphest] [Claimed] T253552: Detailed Reports from game DB

2020-05-28 Thread GoranSMilovanovic
GoranSMilovanovic claimed this task. GoranSMilovanovic added projects: WMDE-FUN-Sprint-2020-04-27, User-GoranSMilovanovic. TASK DETAIL https://phabricator.wikimedia.org/T253552 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GoranSMilovanovic Cc

[Wikidata-bugs] [Maniphest] [Closed] T248308: Analyse a small sample of the most often used query patterns on WDQS

2020-05-19 Thread GoranSMilovanovic
GoranSMilovanovic closed this task as "Resolved". GoranSMilovanovic added a comment. @WMDE-leszek Res, non verba. TASK DETAIL https://phabricator.wikimedia.org/T248308 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GoranSMilo

[Wikidata-bugs] [Maniphest] [Commented On] T248308: Analyse a small sample of the most often used query patterns on WDQS

2020-05-18 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @WMDE-leszek @darthmon_wmde Do we need anything else here in the foreseeable future? TASK DETAIL https://phabricator.wikimedia.org/T248308 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GoranSMilovanovic Cc

[Wikidata-bugs] [Maniphest] [Changed Subscribers] T248308: Analyse a small sample of the most often used query patterns on WDQS

2020-05-04 Thread GoranSMilovanovic
GoranSMilovanovic added a subscriber: Samantha_Alipio_WMDE. GoranSMilovanovic added a comment. @WMDE-leszek @darthmon_wmde @Lydia_Pintscher @Addshore @Gehel @Samantha_Alipio_WMDE This could be useful for tomorrow's discussion on repeated queries: F31802788: queries_Clustered_3000

[Wikidata-bugs] [Maniphest] [Commented On] T240466: Measure the impact of Tainted References Wikidata feature

2020-05-03 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @WMDE-leszek Please, what is the status of this task? TASK DETAIL https://phabricator.wikimedia.org/T240466 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GoranSMilovanovic Cc: Aklapper, Addshore, Jan_Dittrich

[Wikidata-bugs] [Maniphest] [Commented On] T248308: Analyse a small sample of the most often used query patterns on WDQS

2020-04-27 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. Update `Tue 28 Apr 2020 02:17:33 AM UTC` Here goes the update report on SPARQL feature selection via XGBoost: F31783672: WDQS Endpoint Analytics_20200427_B.nb.html <https://phabricator.wikimedia.org/F31783672> - The model perfo

[Wikidata-bugs] [Maniphest] [Commented On] T248308: Analyse a small sample of the most often used query patterns on WDQS

2020-04-27 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. Update `Mon 27 Apr 2020 10:31:05 PM UTC`: **The most frequently observed SPARQL queries dataset** - Selection criteria: the query was observed >= 50 times in the WDQS endpoint sample (approx. `1M` queries, `2020/04/01` - `2020/04/21`). - For e
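The selection criterion (a query observed at least 50 times in the sample) can be illustrated with a short sketch. It assumes queries are compared as exact strings; whether the original analysis normalized queries first is not stated, and the function name `frequent_queries` is hypothetical.

```python
from collections import Counter


def frequent_queries(queries, min_count=50):
    """Return {query: count} for queries seen at least min_count times."""
    counts = Counter(queries)
    return {q: n for q, n in counts.items() if n >= min_count}
```

For example, a sample containing one query 50 times and another 49 times would keep only the first.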

[Wikidata-bugs] [Maniphest] [Commented On] T248308: Analyse a small sample of the most often used query patterns on WDQS

2020-04-27 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. Update `Mon 27 Apr 2020 10:10:23 PM UTC`: **Final reports** - Here goes **Part A** of the Final Report, which covers the Exploratory Data Analysis (EDA) only, encompassing: (1) the characteristics of the sample of SPARQL queries used
