[Wikidata-bugs] [Maniphest] [Claimed] T204440: analyze and visualize the identifier landscape of Wikidata

2019-01-29 Thread GoranSMilovanovic
GoranSMilovanovic claimed this task. GoranSMilovanovic edited projects, added User-GoranSMilovanovic, WMDE-Analytics-Engineering; removed Need-volunteer. TASK DETAIL https://phabricator.wikimedia.org/T204440 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To

[Wikidata-bugs] [Maniphest] [Commented On] T210664: List of most used Wikidata entities

2019-01-26 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @abian The code is in place. Just ping me somewhere whenever you need an update. TASK DETAIL https://phabricator.wikimedia.org/T210664

[Wikidata-bugs] [Maniphest] [Commented On] T210664: List of most used Wikidata entities

2019-01-25 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @abian You're welcome! Your analysis is awesome and could be used to exemplify how a Client/Manager/Editor/Owner should introduce the problem to a Data Scientist/Analyst! TASK DETAIL https://phabricator.wikimedia.org/T210664

[Wikidata-bugs] [Maniphest] [Commented On] T210664: List of most used Wikidata entities

2019-01-24 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @abian @Lydia_Pintscher We have the results. Method: the power law was estimated from 27,394,027 WD items that are currently used across the Wikimedia websites; that is approximately 50% of the items now present in WD (54,195,898 is today's n
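
For illustration only, here is a minimal Python sketch of how a power-law exponent could be estimated from per-item usage counts. The thread does not say which estimator or software was used; the continuous maximum-likelihood (Hill-type) estimator, the fixed lower cutoff `xmin`, the function names, and the synthetic data are all assumptions made for this example.

```python
import math
import random


def estimate_alpha(values, xmin):
    """Continuous MLE of the power-law exponent for values >= xmin.

    Assumption: xmin is given; a full analysis would also estimate xmin
    and test goodness of fit.
    """
    tail = [v for v in values if v >= xmin]
    if not tail:
        raise ValueError("no observations at or above xmin")
    log_sum = sum(math.log(v / xmin) for v in tail)
    if log_sum == 0:
        raise ValueError("all observations equal xmin; exponent undefined")
    return 1.0 + len(tail) / log_sum


def sample_power_law(n, alpha, xmin, seed=0):
    """Draw n synthetic values from a continuous power law (inverse transform)."""
    rng = random.Random(seed)
    # 1 - random() lies in (0, 1], so the power is always defined.
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


if __name__ == "__main__":
    # Hypothetical usage counts; not the data from the ticket.
    usage = sample_power_law(20000, alpha=2.5, xmin=1.0, seed=42)
    print(round(estimate_alpha(usage, xmin=1.0), 2))
```

Real usage counts are integers, so a discrete estimator may be more appropriate; the continuous form is used here only because it has a closed-form solution.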

[Wikidata-bugs] [Maniphest] [Updated] T210664: List of most used Wikidata entities

2019-01-24 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. Update: we have the dataset (WD items, and for each item the number of pages which make use of it), and we're running the power-law estimation procedures now; as predicted that is going to take a while. N.B. I am running this on my private server becaus

[Wikidata-bugs] [Maniphest] [Closed] T214087: "I miss you" dysfunctions with the French Wiktionary

2019-01-23 Thread GoranSMilovanovic
GoranSMilovanovic closed this task as "Resolved". TASK DETAIL https://phabricator.wikimedia.org/T214087

[Wikidata-bugs] [Maniphest] [Unblock] T202610: Cognate dashboard requests tracking

2019-01-23 Thread GoranSMilovanovic
GoranSMilovanovic closed subtask T214087: "I miss you" dysfunctions with the French Wiktionary as "Resolved". TASK DETAIL https://phabricator.wikimedia.org/T202610

[Wikidata-bugs] [Maniphest] [Commented On] T214087: "I miss you" dysfunctions with the French Wiktionary

2019-01-20 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Pamputt @Lea_Lacroix_WMDE Fixed. NOTE: The dashboard's URL has changed; please update: http://wmdeanalytics.wmflabs.org/Wiktionary_CognateDashboard/ TASK DETAIL https://phabricator.wikimedia.org/T214087

[Wikidata-bugs] [Maniphest] [Updated] T214087: "I miss you" dysfunctions with the French Wiktionary

2019-01-18 Thread GoranSMilovanovic
GoranSMilovanovic added projects: WMDE-Analytics-Engineering, User-GoranSMilovanovic. TASK DETAIL https://phabricator.wikimedia.org/T214087

[Wikidata-bugs] [Maniphest] [Claimed] T214087: "I miss you" dysfunctions with the French Wiktionary

2019-01-18 Thread GoranSMilovanovic
GoranSMilovanovic claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T214087

[Wikidata-bugs] [Maniphest] [Closed] T213637: Wikimedia Commons file and category pages should be counted in dashboard for percentage of articles making use of data from Wikidata

2019-01-15 Thread GoranSMilovanovic
GoranSMilovanovic closed this task as "Resolved". GoranSMilovanovic added a comment. @Mike_Peel You're welcome. Closing the ticket. TASK DETAIL https://phabricator.wikimedia.org/T213637

[Wikidata-bugs] [Maniphest] [Commented On] T213637: Wikimedia Commons file and category pages should be counted in dashboard for percentage of articles making use of data from Wikidata

2019-01-14 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. Tested, tests OK. Running in production: dashboard. Please check this, let me know if this is what you've asked for, and if yes let's close the ticket. Thank you. TASK DETAIL https://phabricator.wikimedia.org/T213637

[Wikidata-bugs] [Maniphest] [Commented On] T213637: Wikimedia Commons file and category pages should be counted in dashboard for percentage of articles making use of data from Wikidata

2019-01-14 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. Done. Testing now. TASK DETAIL https://phabricator.wikimedia.org/T213637

[Wikidata-bugs] [Maniphest] [Claimed] T213637: Wikimedia Commons file and category pages should be counted in dashboard for percentage of articles making use of data from Wikidata

2019-01-12 Thread GoranSMilovanovic
GoranSMilovanovic claimed this task. GoranSMilovanovic added a project: User-GoranSMilovanovic. TASK DETAIL https://phabricator.wikimedia.org/T213637

[Wikidata-bugs] [Maniphest] [Commented On] T210664: List of most used Wikidata entities

2019-01-10 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher Hey, what is your take on this ticket and especially T210664#4860427? Thanks. TASK DETAIL https://phabricator.wikimedia.org/T210664

[Wikidata-bugs] [Maniphest] [Commented On] T210664: List of most used Wikidata entities

2019-01-07 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @abian Ok, but I need @Lydia_Pintscher to give me a go for this first. TASK DETAIL https://phabricator.wikimedia.org/T210664

[Wikidata-bugs] [Maniphest] [Commented On] T210664: List of most used Wikidata entities

2019-01-07 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @abian @Lydia_Pintscher It would be very difficult to define a rational criterion of how many top frequently used WD items to protect. But maybe there is a way. Namely, the distribution of item usage, as you can observe, almost certainly follows the power-law

[Wikidata-bugs] [Maniphest] [Closed] T179286: WDCM Regular Updates

2019-01-06 Thread GoranSMilovanovic
GoranSMilovanovic closed this task as "Resolved". GoranSMilovanovic added a comment. @Lydia_Pintscher Following the developments on T210147: WDCM main update engine will run on a weekly basis, synced to start 10 hours after the onset of the Sqoop procedure (i.e. transfer from MariaDB to

[Wikidata-bugs] [Maniphest] [Commented On] T191416: track usage of Wikibase Lua functions

2018-12-21 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Addshore if you mean "we need a Grafana dashboard", please provide an .MD file describing the metrics; if you mean Shiny, please let me know which schema I need to access. TASK DETAIL https://phabricator.wikimedia.org/T191416

[Wikidata-bugs] [Maniphest] [Claimed] T210664: List of most used Wikidata entities

2018-11-29 Thread GoranSMilovanovic
GoranSMilovanovic claimed this task. GoranSMilovanovic added projects: User-GoranSMilovanovic, WMDE-Analytics-Engineering. TASK DETAIL https://phabricator.wikimedia.org/T210664

[Wikidata-bugs] [Maniphest] [Changed Subscribers] T210664: List of most used Wikidata entities

2018-11-28 Thread GoranSMilovanovic
GoranSMilovanovic added subscribers: Lydia_Pintscher, GoranSMilovanovic. GoranSMilovanovic added a comment. @Lydia_Pintscher Should we pick up this one? Please let me know. TASK DETAIL https://phabricator.wikimedia.org/T210664

[Wikidata-bugs] [Maniphest] [Commented On] T193969: track percentage of articles making use of data from Wikidata

2018-11-28 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @DarTar Please expand in the report what's meant by "Wikidata usage", it's ambiguous and could be interpreted as items linked vs statement-level data reused via templates. The exact definition of what "Wikidata usage" refers to

[Wikidata-bugs] [Maniphest] [Commented On] T208569: Get Wikidata clickstream

2018-11-28 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lea_WMDE Just to clarify, I will put no further effort in here until you let me know what you think with respect to my insights in T208569#4767950. TASK DETAIL https://phabricator.wikimedia.org/T208569

[Wikidata-bugs] [Maniphest] [Commented On] T208569: Get Wikidata clickstream

2018-11-23 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @JAllemandou So kind of you, thanks! TASK DETAIL https://phabricator.wikimedia.org/T208569

[Wikidata-bugs] [Maniphest] [Changed Subscribers] T208569: Get Wikidata clickstream

2018-11-22 Thread GoranSMilovanovic
GoranSMilovanovic added a subscriber: JAllemandou. GoranSMilovanovic added a comment. @JAllemandou Hey, I need an insight into the production code for the Clickstream dataset, but I can't find the code repository anywhere. Maybe you could help? Thanks. N.B. I am not looking for Python use cases

[Wikidata-bugs] [Maniphest] [Updated] T179286: WDCM Regular Updates

2018-11-22 Thread GoranSMilovanovic
GoranSMilovanovic added a parent task: T210147: Optimize WDCM update engines. TASK DETAIL https://phabricator.wikimedia.org/T179286

[Wikidata-bugs] [Maniphest] [Commented On] T206214: Basic data on Wikidata use: Edit counts, frequency

2018-11-18 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Jan_Dittrich Jan, please: do we need this ticket anymore? Thanks! TASK DETAIL https://phabricator.wikimedia.org/T206214

[Wikidata-bugs] [Maniphest] [Updated] T208567: Count Wikidata page views per page type

2018-11-02 Thread GoranSMilovanovic
GoranSMilovanovic added a project: WMDE-Analytics-Engineering. TASK DETAIL https://phabricator.wikimedia.org/T208567

[Wikidata-bugs] [Maniphest] [Commented On] T206214: Basic data on Wikidata use: Edit counts, frequency

2018-11-01 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Jan_Dittrich Any feedback in relation to this? Shall I close the task as resolved or do you have any further requests in relation to this dataset? TASK DETAIL https://phabricator.wikimedia.org/T206214

[Wikidata-bugs] [Maniphest] [Commented On] T206214: Basic data on Wikidata use: Edit counts, frequency

2018-11-01 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Jan_Dittrich I did not attach this file in relation to T206214#4690469, did I? F26791999: WD_edits_Notebook.nb.html TASK DETAIL https://phabricator.wikimedia.org/T206214

[Wikidata-bugs] [Maniphest] [Updated] T205265: Investigate how we can measure interactions with WD item and property pages

2018-11-01 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lea_WMDE @Addshore My first insights: the wmf.webrequest table (schema) will not be of any help here: from there we can learn (a) when someone hit a particular Wikidata page (item, property), and also, up to some degree, (b) where did the user come from

[Wikidata-bugs] [Maniphest] [Commented On] T205265: Investigate how we can measure interactions with WD item and property pages

2018-10-29 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lea_WMDE Following, understood, on it :) TASK DETAIL https://phabricator.wikimedia.org/T205265

[Wikidata-bugs] [Maniphest] [Updated] T205265: Investigate how we can measure interactions with WD item and property pages

2018-10-29 Thread GoranSMilovanovic
GoranSMilovanovic added projects: User-GoranSMilovanovic, WMDE-Analytics-Engineering. TASK DETAIL https://phabricator.wikimedia.org/T205265

[Wikidata-bugs] [Maniphest] [Commented On] T206214: Basic data on Wikidata use: Edit counts, frequency

2018-10-25 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Jan_Dittrich I did not attach this file in relation to T206214#4690469, did I? F26791999: WD_edits_Notebook.nb.html TASK DETAIL https://phabricator.wikimedia.org/T206214

[Wikidata-bugs] [Maniphest] [Updated] T193969: track percentage of articles making use of data from Wikidata

2018-10-23 Thread GoranSMilovanovic
GoranSMilovanovic added a subscriber: Daniel_Mietchen. GoranSMilovanovic added a comment. @Lydia_Pintscher Please see: T206214#4690482 from @Daniel_Mietchen (my bad: the suggestion didn't get here; I shared the wrong Phab ticket with Daniel). TASK DETAIL https://phabricator.wikimedia.org/T1

[Wikidata-bugs] [Maniphest] [Commented On] T206214: Basic data on Wikidata use: Edit counts, frequency

2018-10-23 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Daniel_Mietchen My bad: I gave you the wrong Phab ticket for this. Sorry. Please: https://phabricator.wikimedia.org/T193969 TASK DETAIL https://phabricator.wikimedia.org/T206214

[Wikidata-bugs] [Maniphest] [Commented On] T206214: Basic data on Wikidata use: Edit counts, frequency

2018-10-23 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Jan_Dittrich Here we go: please let me know whether the alternative visualization of edits vs. discussions in Q3.1 works for you; Q4.2 is now fixed: log10() used instead of the natural logarithm; data points are labeled in order to preserve the absolute values

[Wikidata-bugs] [Maniphest] [Commented On] T193969: track percentage of articles making use of data from Wikidata

2018-10-22 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher Added some charts on the dashboard as previously agreed. Please let me know if any other data aggregates would be useful. TASK DETAIL https://phabricator.wikimedia.org/T193969

[Wikidata-bugs] [Maniphest] [Commented On] T206214: Basic data on Wikidata use: Edit counts, frequency

2018-10-21 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Jan_Dittrich Here we go: – 1.1. "Q1.1 Checking for power-law behavior" has two log scaled axis, if I read it correctly. I do not get what the numerical labels on the y axis mean – is this number of users, but after log transformation, so 1 use

[Wikidata-bugs] [Maniphest] [Commented On] T206214: Basic data on Wikidata use: Edit counts, frequency

2018-10-19 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Jan_Dittrich Thank you for your comments. I will provide all necessary explanations later in the evening. TASK DETAIL https://phabricator.wikimedia.org/T206214

[Wikidata-bugs] [Maniphest] [Changed Subscribers] T206214: Basic data on Wikidata use: Edit counts, frequency

2018-10-16 Thread GoranSMilovanovic
GoranSMilovanovic added a subscriber: Lydia_Pintscher. GoranSMilovanovic added a comment. @Jan_Dittrich Please find your Report attached. Pinging @Lydia_Pintscher who might also be interested to take a look at the results. F26614255: WD_edits_Notebook.nb.html @Jan_Dittrich As of the following

[Wikidata-bugs] [Maniphest] [Claimed] T206214: Basic data on Wikidata use: Edit counts, frequency

2018-10-04 Thread GoranSMilovanovic
GoranSMilovanovic claimed this task. GoranSMilovanovic added projects: WMDE-Analytics-Engineering, User-GoranSMilovanovic. GoranSMilovanovic added a subscriber: RazShuty. TASK DETAIL https://phabricator.wikimedia.org/T206214

[Wikidata-bugs] [Maniphest] [Commented On] T193969: track percentage of articles making use of data from Wikidata

2018-10-01 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher And the solution is linked from https://grafana.wikimedia.org/dashboard/db/wikidata-entity-usage Let me know if you need anything here. Thanks. TASK DETAIL https://phabricator.wikimedia.org/T193969

[Wikidata-bugs] [Maniphest] [Commented On] T193969: track percentage of articles making use of data from Wikidata

2018-10-01 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher Here's where the daily update can be found online: http://wdcm.wmflabs.org/WD_percentUsageDashboard/ TASK DETAIL https://phabricator.wikimedia.org/T193969

[Wikidata-bugs] [Maniphest] [Commented On] T193969: track percentage of articles making use of data from Wikidata

2018-09-30 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher Here's a table with the desired statistics. F26260965: wdUsage_ProjectStatistics.csv Columns: numPages = the number of pages from namespace = 0, no redirects; wdUsePages = the number of pages among them that make any use of Wikidata exce
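
As a rough illustration of how the two columns named above could be combined into the tracked percentage, here is a minimal Python sketch. The function name and the sample numbers are hypothetical; only the column names `numPages` and `wdUsePages` come from the comment, and the zero-page handling is an assumption.

```python
def percent_usage(num_pages, wd_use_pages):
    """Percentage of a project's pages that make use of Wikidata.

    num_pages    -- value of the numPages column for one project
    wd_use_pages -- value of the wdUsePages column for the same project
    Assumption: a project with no pages is reported as 0.0 rather than failing.
    """
    if num_pages <= 0:
        return 0.0
    return 100.0 * wd_use_pages / num_pages


if __name__ == "__main__":
    # Hypothetical numbers, not taken from wdUsage_ProjectStatistics.csv.
    print(percent_usage(200, 50))
```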

[Wikidata-bugs] [Maniphest] [Updated] T205394: https://grafana.wikimedia.org/dashboard/db/wikidata-dump-downloads is broken

2018-09-25 Thread GoranSMilovanovic
GoranSMilovanovic added a project: User-GoranSMilovanovic. TASK DETAIL https://phabricator.wikimedia.org/T205394

[Wikidata-bugs] [Maniphest] [Commented On] T204842: Track the usage of Wikidata entities on main namespaces pages in Wikimedia projects

2018-09-19 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @WMDE-leszek Anytime. TASK DETAIL https://phabricator.wikimedia.org/T204842

[Wikidata-bugs] [Maniphest] [Commented On] T204842: Track the usage of Wikidata entities on main namespaces pages in Wikimedia projects

2018-09-19 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @WMDE-leszek I should be able to provide this data with slight modifications to the already existing WDCM engine scripts. Let me know if you'd like me to pick this one up. TASK DETAIL https://phabricator.wikimedia.org/T204842

[Wikidata-bugs] [Maniphest] [Commented On] T204438: finding statements that need a reference

2018-09-16 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher Let me know if this is for me, and I'll claim it. TASK DETAIL https://phabricator.wikimedia.org/T204438

[Wikidata-bugs] [Maniphest] [Commented On] T204437: better understanding of conflicts around data

2018-09-16 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher This looks like... well... should I claim it? Please advise. Thanks. TASK DETAIL https://phabricator.wikimedia.org/T204437

[Wikidata-bugs] [Maniphest] [Commented On] T203609: simplewiktionary provides wrong stats

2018-09-07 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Pamputt I think you need to take into consideration that the dashboard's update cycle takes a while (see: T203609#4563662). Today, CET 12:45 approximately: source: frwiktionary target: simplewiktionary Expected result: entries prese

[Wikidata-bugs] [Maniphest] [Commented On] T203609: simplewiktionary provides wrong stats

2018-09-06 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. Hi @Pamputt of course they all exist. Once again, there is a difference between (a) the existence of page, and (b) the existence of the content of the page. All of the pages that you refer to exist on simple.wiktionary, and their content is : There is

[Wikidata-bugs] [Maniphest] [Commented On] T203609: simplewiktionary provides wrong stats

2018-09-06 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Pamputt Also, I have checked for: abotonaré. It does exist on the fr.wiktionary, and not on the simple.wiktionary. Which is exactly what the dashboard delivers when asked to compare for a source and a target Wiktionary: what is found in the target, but not found

[Wikidata-bugs] [Maniphest] [Commented On] T203609: simplewiktionary provides wrong stats

2018-09-06 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Pamputt From the instructions on the top of the Compare page: The Dashboard will generate a table of all entries found in the Target, but not in the Source Wiktionary. Now, we use source = simplewiktionary and target = frwiktionary, and expect to receive
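
The comparison rule quoted above (all entries found in the Target, but not in the Source) amounts to a set difference. A minimal Python sketch follows, assuming each Wiktionary is represented as a plain list of entry titles; the function name and the sample titles are hypothetical, and this is not the dashboard's actual code.

```python
def entries_missing_from_source(source_entries, target_entries):
    """Return the entries found in the target but not in the source, sorted."""
    return sorted(set(target_entries) - set(source_entries))


if __name__ == "__main__":
    # Hypothetical entry lists for illustration only.
    source = ["abasia", "cat"]               # e.g. simplewiktionary
    target = ["abasia", "abotonaré", "chat"]  # e.g. frwiktionary
    print(entries_missing_from_source(source, target))
```

Swapping the source and the target changes the result, which is why the direction of the comparison matters when reading the dashboard's table.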

[Wikidata-bugs] [Maniphest] [Commented On] T203609: simplewiktionary provides wrong stats

2018-09-06 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Pamputt I have checked for the abasia entry right now. As I have assumed: the dashboard "self-repaired" following one or two update cycles. Namely, by comparing simplewikt as source and frwikt as target, we now find: abasias (please note the

[Wikidata-bugs] [Maniphest] [Commented On] T203609: simplewiktionary provides wrong stats

2018-09-05 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Pamputt As ever, thank you for your input. "대신에" and "Aako" do not exist on both frwikt and simplewikt Both 대신에 and Aako exist on simple wiktionary. There is a difference between (1) a page that does not exist, and (2) a page that ex

[Wikidata-bugs] [Maniphest] [Edited] T198866: Uses of wb_terms SQL table to be migrated away from

2018-09-04 Thread GoranSMilovanovic
GoranSMilovanovic updated the task description. (Show Details) CHANGES TO TASK DESCRIPTION...## Planned/No longer using: Wikidata Concept Monitor (WDCM) * Used to get English labels of items to be displayed from wb_terms. Now uses MW API. * However, fetching a large number of labels from the MW

[Wikidata-bugs] [Maniphest] [Commented On] T144010: Drop eu_touched in production

2018-09-03 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Marostegui Thank you! TASK DETAIL https://phabricator.wikimedia.org/T144010

[Wikidata-bugs] [Maniphest] [Commented On] T144010: Drop eu_touched in production

2018-09-03 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Marostegui I see that in enwiki there is no more eu_touched in the wbc_entity_usage table. This made one of my crucial scripts running from stat1004 statbox crash, and the whole WDCM system is thus not being updated. Could you please confirm that these changes

[Wikidata-bugs] [Maniphest] [Commented On] T191416: track usage of Wikibase Lua functions

2018-08-17 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @hoo @Lydia_Pintscher As of this ticket: @hoo Thank you very much for your instructions! However, I am a contractor Data Scientist for WMDE, which in effect means: I can do things like R (what we use for analytics in WMDE), MATLAB, Octave, some Python for

[Wikidata-bugs] [Maniphest] [Commented On] T191416: track usage of Wikibase Lua functions

2018-08-09 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @hoo Thanks! I will get in touch with you ASAP on this. TASK DETAIL https://phabricator.wikimedia.org/T191416

[Wikidata-bugs] [Maniphest] [Commented On] T144010: Drop eu_touched in production

2018-07-25 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Marostegui Thank you very much! Can you estimate when the new schema would become operational? TASK DETAIL https://phabricator.wikimedia.org/T144010

[Wikidata-bugs] [Maniphest] [Commented On] T144010: Drop eu_touched in production

2018-07-25 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Marostegui Hi, could you please confirm the following: no other changes in the wbc_entity_usage schema will take place except for dropping the eu_touched field? The WDCM operations rely on weekly Apache Sqoop runs across the wbc_entity_usage tables from

[Wikidata-bugs] [Maniphest] [Commented On] T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore

2018-06-25 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. Ok. Someone please ping me when the relevant metrics are instantiated in Graphite. @Addshore As of the comments in respect to T154601#4275121 (wb_terms -> Big Data), well, obviously because it makes more sense to work with a proper Big Data solution than

[Wikidata-bugs] [Maniphest] [Unassigned] T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore

2018-06-25 Thread GoranSMilovanovic
GoranSMilovanovic removed GoranSMilovanovic as the assignee of this task. TASK DETAIL https://phabricator.wikimedia.org/T154601

[Wikidata-bugs] [Maniphest] [Commented On] T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore

2018-06-24 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher @Addshore @WMDE-leszek @Aleksey_WMDE Putting aside the question of the afterlife of the SQL wb_terms table for now: is there anything that can be done to fix this Dashboard anytime soon, or do we wait for a new data engineering solution for the

[Wikidata-bugs] [Maniphest] [Commented On] T191416: track usage of Wikibase Lua functions

2018-06-21 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. I can pick this one up if someone can provide an introduction to Lua modules usage in Wikimedia projects for me. The current task description does not provide enough information on how to perform the task (where are the data: how does one learn which Lua module

[Wikidata-bugs] [Maniphest] [Commented On] T191424: track number of Lexemes and Forms

2018-06-20 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. Lexemes moved to the right Y-axis with properties. TASK DETAIL https://phabricator.wikimedia.org/T191424

[Wikidata-bugs] [Maniphest] [Commented On] T191424: track number of Lexemes and Forms

2018-06-20 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher @Addshore Lexeme usage is now present at https://grafana.wikimedia.org/dashboard/db/wikidata-datamodel (note: the data has just started to roll in, so you are looking for a point in the lower right corner of the graph). We need to understand

[Wikidata-bugs] [Maniphest] [Commented On] T191424: track number of Lexemes and Forms

2018-06-18 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Addshore https://gerrit.wikimedia.org/r/#/c/analytics/wmde/scripts/+/440901/ Q1. Where do you make these SQL calls from? It would help me to understand the workflow. Q2. Where from (and when) do you send the data obtained from SQL to Graphite? Q3. Would you

[Wikidata-bugs] [Maniphest] [Commented On] T191424: track number of Lexemes and Forms

2018-06-13 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher @Addshore The mystery is temporarily resolved: namely, the tracking for lexemes and forms is still not implemented, cf. https://github.com/wikimedia/analytics-wmde-scripts/blob/master/src/wikidata/site_stats/pages_by_namespace.php#L20 https

[Wikidata-bugs] [Maniphest] [Changed Subscribers] T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore

2018-06-12 Thread GoranSMilovanovic
GoranSMilovanovic added subscribers: Aleksey_WMDE, WMDE-leszek. GoranSMilovanovic added a comment. @Addshore After reconsidering this, I have to state openly that I am against relying on JSON dumps as the only source of data. @WMDE-leszek @Aleksey_WMDE will also be interested to hear, I guess

[Wikidata-bugs] [Maniphest] [Commented On] T191424: track number of Lexemes and Forms

2018-06-12 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher Modifying the existing Graphite query for Items and Properties (see T191424#4236348) to incorporate the Lexemes in the following way: aliasSub( aliasSub( aliasSub( aliasByNode(daily.wikidata.site_stats.pages_by_namespace

[Wikidata-bugs] [Maniphest] [Updated] T196193: Create KNIME nodes to interact with Wikidata

2018-06-11 Thread GoranSMilovanovic
GoranSMilovanovic added a project: User-GoranSMilovanovic. TASK DETAIL https://phabricator.wikimedia.org/T196193

[Wikidata-bugs] [Maniphest] [Commented On] T196193: Create KNIME nodes to interact with Wikidata

2018-06-11 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @abian @Lydia_Pintscher KNIME integrates with R and Java nicely. Given that we already use R extensively to analyse Wikidata, it could be possible to build a set of R developed Wikidata nodes for KNIME, I guess. If it is ETL only that you need, orchestrating

[Wikidata-bugs] [Maniphest] [Commented On] T191424: track number of Lexemes and Forms

2018-05-28 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher Modifying the existing Graphite query for Items and Properties (see T191424#4236348) to incorporate the Lexemes in the following way: aliasSub( aliasSub( aliasSub( aliasByNode(daily.wikidata.site_stats.pages_by_namespace

[Wikidata-bugs] [Maniphest] [Commented On] T191424: track number of Lexemes and Forms

2018-05-28 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher In the Wikidata namespaces documentation: https://www.wikidata.org/wiki/Help:Namespaces there is no mention of the Lexemes and Forms namespaces. In order to have the current Graphite query for Items/Properties: aliasSub( aliasSub
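
The nested aliasSub( ... aliasByNode( ... ) ) pattern quoted above can be illustrated with a small Python emulation. This is only a sketch, not the actual dashboard query: the series path prefix is taken from the comment, but the assumption that the fifth node is a namespace ID, the namespace IDs themselves (0, 120, 146), and the display labels are illustrative and should be checked against the live configuration.

```python
import re

def alias_by_node(series_name, *nodes):
    """Emulate Graphite's aliasByNode: keep only the chosen dot-separated nodes (0-indexed)."""
    parts = series_name.split(".")
    return ".".join(parts[n] for n in nodes)

def alias_sub(name, search, replace):
    """Emulate Graphite's aliasSub: regex search/replace on the series name."""
    return re.sub(search, replace, name)

# ASSUMPTION: namespace IDs and labels are hypothetical examples,
# not values confirmed by the thread.
LABELS = [
    (r"^0$", "Items"),
    (r"^120$", "Properties"),
    (r"^146$", "Lexemes"),
]

def label(series_name):
    # Inner call: reduce the full path to its namespace node (index 4).
    name = alias_by_node(series_name, 4)
    # Outer calls: one aliasSub per namespace, applied in sequence.
    for search, replace in LABELS:
        name = alias_sub(name, search, replace)
    return name

if __name__ == "__main__":
    print(label("daily.wikidata.site_stats.pages_by_namespace.146"))
```

Each additional entity type then needs one more aliasSub wrapper, which is why the query grows by one nesting level when Lexemes are added.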

[Wikidata-bugs] [Maniphest] [Commented On] T191424: track number of Lexemes and Forms

2018-05-28 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. I am on it. TASK DETAIL https://phabricator.wikimedia.org/T191424 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GoranSMilovanovic Cc: GoranSMilovanovic, Aklapper, Lydia_Pintscher, Lahi, Gq86, QZanden, LawExplorer, Wikidata

[Wikidata-bugs] [Maniphest] [Updated] T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore

2018-05-28 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher @Addshore I've seen a lot of data re-engineering going on around the wb_terms SQL table recently. The WDCM system also experiences some problems when fetching from this table (check the WDCM dashboards and you will spot too many missing l

[Wikidata-bugs] [Maniphest] [Commented On] T191424: track number of Lexemes and Forms

2018-05-28 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher Done. Please check the dashboard. TASK DETAIL https://phabricator.wikimedia.org/T191424 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GoranSMilovanovic Cc: GoranSMilovanovic, Aklapper, Lydia_Pintscher, Lahi

[Wikidata-bugs] [Maniphest] [Claimed] T191424: track number of Lexemes and Forms

2018-05-28 Thread GoranSMilovanovic
GoranSMilovanovic claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T191424 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GoranSMilovanovic Cc: GoranSMilovanovic, Aklapper, Lydia_Pintscher, Lahi, Gq86, QZanden, LawExplorer, Wikidata-bugs, aude

[Wikidata-bugs] [Maniphest] [Changed Subscribers] T193969: track percentage of articles making use of data from Wikidata

2018-05-06 Thread GoranSMilovanovic
GoranSMilovanovic added a subscriber: Addshore. GoranSMilovanovic added a comment. @Addshore I need a second opinion on the following, please. One of your generating scripts for this Grafana dashboard iterates across the project databases and counts the pages that make use of any aspects except

[Wikidata-bugs] [Maniphest] [Updated] T193969: track percentage of articles making use of data from Wikidata

2018-05-06 Thread GoranSMilovanovic
GoranSMilovanovic added a project: User-GoranSMilovanovic. TASK DETAIL https://phabricator.wikimedia.org/T193969 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GoranSMilovanovic Cc: Aklapper, Lydia_Pintscher, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer

[Wikidata-bugs] [Maniphest] [Claimed] T193969: track percentage of articles making use of data from Wikidata

2018-05-06 Thread GoranSMilovanovic
GoranSMilovanovic claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T193969 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GoranSMilovanovic Cc: Aklapper, Lydia_Pintscher, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, Wikidata-bugs, aude

[Wikidata-bugs] [Maniphest] [Claimed] T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore

2018-05-02 Thread GoranSMilovanovic
GoranSMilovanovic claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T154601 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GoranSMilovanovic Cc: Ivanhercaz, VIGNERON, Lydia_Pintscher, GoranSMilovanovic, gerritbot, Addshore, Sjoerddebruin

[Wikidata-bugs] [Maniphest] [Commented On] T154601: Grafana: "wikidata-datamodel-terms" doesn't update anymore

2018-05-02 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher I'm on it. TASK DETAIL https://phabricator.wikimedia.org/T154601 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GoranSMilovanovic Cc: Ivanhercaz, VIGNERON, Lydia_Pintscher, GoranSMilovanovic, gerr

[Wikidata-bugs] [Maniphest] [Commented On] T184057: Productionize Generation of Wikidata maps and associated data (currently at https://tools.wmflabs.org/wikidata-analysis/)

2018-04-05 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Addshore Thanks a lot! TASK DETAIL https://phabricator.wikimedia.org/T184057 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GoranSMilovanovic Cc: GoranSMilovanovic, Addshore, Aklapper, Lahi, Gq86, QZanden, LawExplorer

[Wikidata-bugs] [Maniphest] [Updated] T191424: track number of Lexemes and Forms

2018-04-05 Thread GoranSMilovanovic
GoranSMilovanovic added a project: User-GoranSMilovanovic. TASK DETAIL https://phabricator.wikimedia.org/T191424 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GoranSMilovanovic Cc: GoranSMilovanovic, Aklapper, Lydia_Pintscher, Lahi, Gq86, Cinemantique, QZanden

[Wikidata-bugs] [Maniphest] [Updated] T191416: track usage of Wikibase Lua functions

2018-04-05 Thread GoranSMilovanovic
GoranSMilovanovic added a project: User-GoranSMilovanovic. TASK DETAIL https://phabricator.wikimedia.org/T191416 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GoranSMilovanovic Cc: GoranSMilovanovic, hoo, Aklapper, Lydia_Pintscher, Lahi, Gq86, QZanden

[Wikidata-bugs] [Maniphest] [Commented On] T184173: WD Feature Evaluation Plans

2018-03-15 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher @Jan_Dittrich What is the status here? Shall we close this ticket and then open new ones as soon as particular WD features need any testing/evaluation? TASK DETAIL https://phabricator.wikimedia.org/T184173 EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] [Updated] T184057: Productionize Generation of Wikidata maps and associated data (currently at https://tools.wmflabs.org/wikidata-analysis/)

2018-02-28 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Addshore I'm already making use of this data set to prototype the WDCM Biases Dashboard T184109, and the dashboard will continue to make use of it. However, given that the geo-coordinates do not tend to change every now and then, not updating this data se

[Wikidata-bugs] [Maniphest] [Updated] T119976: Track number of stubs on top 20 wikipedias

2018-02-28 Thread GoranSMilovanovic
GoranSMilovanovic added a project: User-GoranSMilovanovic. TASK DETAIL https://phabricator.wikimedia.org/T119976 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GoranSMilovanovic Cc: GoranSMilovanovic, Esc3300, Liuxinyu970226, ChrisPins, Mbch331, Izno, aude

[Wikidata-bugs] [Maniphest] [Commented On] T187521: Optimize recentchanges and wbc_entity_usage table across wikis

2018-02-15 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Marostegui The R script that orchestrates Apache Sqoop connects to analytics-store.eqiad.wmnet by using my analytics-research-client.cnf credentials from stat1004 - I wouldn't know exactly the server to which that resolves. TASK DETAIL

[Wikidata-bugs] [Maniphest] [Commented On] T187521: Optimize recentchanges and wbc_entity_usage table across wikis

2018-02-15 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Marostegui m = 0, h = 0, dom = 7,14,21,29, mon = *, dow = *, i.e., on the 7th, 14th, 21st, and 29th of each month at 00:00 UTC. TASK DETAIL https://phabricator.wikimedia.org/T187521 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To
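
The schedule fields in the comment above correspond to the standard crontab expression `0 0 7,14,21,29 * *`. As a rough sketch of what that implies, the following Python enumerates the run times for a given month; the helper name `run_times` is hypothetical and not part of the actual job. Note that a day-of-month that does not exist in a given month (the 29th in a non-leap February) simply produces no run.

```python
import calendar
from datetime import datetime, timezone

# Day-of-month field as quoted in the comment: dom = 7,14,21,29
RUN_DAYS = (7, 14, 21, 29)

def run_times(year, month, days=RUN_DAYS):
    """Return the UTC datetimes at which a '0 0 7,14,21,29 * *' job fires in one month.

    Hypothetical helper for illustration only; days that do not exist
    in the month (e.g. 29 February in a non-leap year) are skipped.
    """
    last_day = calendar.monthrange(year, month)[1]
    return [
        datetime(year, month, day, 0, 0, tzinfo=timezone.utc)
        for day in days
        if day <= last_day
    ]

if __name__ == "__main__":
    for t in run_times(2018, 2):
        print(t.isoformat())
```

The interval between the run on the 29th and the next run on the 7th is longer than seven days, so the schedule is only approximately weekly.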

[Wikidata-bugs] [Maniphest] [Commented On] T187521: Optimize recentchanges and wbc_entity_usage table across wikis

2018-02-15 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Ladsgroup @Marostegui I have a cron job on stat1004 that Sqoops the wbc_entity_usage tables for all projects into a HiveQL table for the Wikidata Concepts Monitor pre-processing. The cron job runs on a weekly schedule. Please let me know if you think it would

[Wikidata-bugs] [Maniphest] [Commented On] T184173: WD Feature Evaluation Plans

2018-02-08 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher @Jan_Dittrich What is the status here: any new developments, any features to evaluate? TASK DETAIL https://phabricator.wikimedia.org/T184173 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GoranSMilovanovic Cc

[Wikidata-bugs] [Maniphest] [Closed] T184716: analyse template usage on Wiktionaries

2018-02-07 Thread GoranSMilovanovic
GoranSMilovanovic closed this task as "Resolved". GoranSMilovanovic added a comment. @Ladsgroup Thanks, Amir - and exactly as I have assumed. @Lydia_Pintscher I wanted to bypass the analysis of the templatelinks tables because I would face the same problem there as I did in my early a

[Wikidata-bugs] [Maniphest] [Changed Subscribers] T184716: analyse template usage on Wiktionaries

2018-02-07 Thread GoranSMilovanovic
GoranSMilovanovic added a subscriber: Ladsgroup. GoranSMilovanovic added a comment. @Lydia_Pintscher Hmm, I had some doubts on whether a case of "S" aspect Wikidata usage is equivalent to transclusion or not for templates. If you need these pages that @Ladsgroup pointed to web-scraped and

[Wikidata-bugs] [Maniphest] [Commented On] T184716: analyse template usage on Wiktionaries

2018-02-06 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher Please find enclosed the following: (1) Report on Wikimedia Template Usage in Wiktionaries, and (2) a full source data set used to produce everything that is found in the Report. The table is described in the text. Please let me know if you

[Wikidata-bugs] [Maniphest] [Commented On] T184716: analyse template usage on Wiktionaries

2018-02-06 Thread GoranSMilovanovic
GoranSMilovanovic added a comment. @Lydia_Pintscher You did, and I've learned that immediately from the WDCM Structure Dashboard (check out the Make your own network tab, and enter any Wikidata item, or up to five of them). Now let's see about those templates. TASK D
