Re: [Analytics] Onboarding our data scientist: Getting started material created

2017-03-20 Thread Goran Milovanovic
These are wikis, feel free to update: https://wikitech.wikimedia.org/wiki/Analytics/Onboarding#Wikitech.2FLabs as needed. As I said we update onboarding docs every time we hire and thus these are going to be 4 months out of date. > > Thanks, > > Nuria

Re: [Analytics] Onboarding our data scientist: Getting started material created

2017-03-20 Thread Goran Milovanovic
Thanks. I am documenting all the steps, and of course I will share once everything is settled and I feel confident enough about my own understanding of the process. Regards, Goran On 20 Mar 2017 20:12, "Pine W" wrote: > As a general comment, making onboarding

Re: [Analytics] Onboarding our data scientist: Getting started material created

2017-03-20 Thread Goran Milovanovic
Hi there, I am the Data Analyst referred to in: "WMDE will onboard their data analyst soon" : ) Please, just a quick check to see whether I get the things right - feedback appreciated: - I need to open a Wikitech account (DONE); - I need to register for ToolLabs (DONE; Waiting for Admin

Re: [Analytics] [Engineering] [Analytics Cluster] Downtime announcement for Oozie/Hive - Dec 7 10AM CET

2017-12-07 Thread Goran Milovanovic
Hi Luca, well, given that you already have to deal with Hive today, just to report back that I have had a few situations with the HS2 server rejecting my queries in the previous days, reporting back that the most likely reason is the number of open connections. I guess some defensive

Re: [Analytics] [Engineering] Hadoop Cluster Maintenance - Now

2018-02-13 Thread Goran Milovanovic
May the Force be with you Goran S. Milovanović, PhD Data Scientist, Software Department Wikimedia Deutschland "It's not the size of the dog in the fight, it's the size of the fight in the dog." - Mark Twain

Re: [Analytics] Hive and Oozie unavailable for a brief hardware maintenance on Sept 7th

2018-09-10 Thread Goran Milovanovic
Hi, I am sorry if it turns out that you have already informed us about this: please, is the cluster reboot completed? I have a lot of Hive jobs waiting. Thanks, Goran On Fri, Sep 7, 2018, 13:57 Luca Toscano wrote: > Hi again, > > the maintenance didn't happen because we currently have

Re: [Analytics] Hive and Oozie unavailable for a brief hardware maintenance on Sept 7th

2018-09-10 Thread Goran Milovanovic
ld take a maximum of 2 hours > from now. Apologies for the inconvenience. > > Luca > > On Mon, 10 Sep 2018 at 13:08, Goran Milovanovic < > goran.milovanovic_...@wikimedia.de> wrote: > >> Hi, >> >> I am sorry if it turns out that you ha

Re: [Analytics] Wiktionary word page views?

2018-10-23 Thread Goran Milovanovic
@James Salsman I am not sure if we have a tool somewhere designed specifically for that purpose, but you can get many important statistics on Wiktionary from http://wdcm.wmflabs.org/Wiktionary_CognateDashboard/ Regards, Goran Goran S. Milovanović, PhD Data Scientist, Software Department

Re: [Analytics] Article about ML in production woes

2019-02-07 Thread Goran Milovanovic
Hi Andrew, I have recently started a six-month AI/Machine Learning Engineering course which focuses exactly on the topics that you've shown interest in. So, >>> I'd love it if we had a working group (or whatever) that focused on how to standardize how we train and deploy ML for production use.

Re: [Analytics] [Data Release] Active Editors by country

2019-11-07 Thread Goran Milovanovic
Bravo Dan & Analytics team! On Thu, Nov 7, 2019, 19:58 Dan Andreescu wrote: > Today we are releasing a new dataset meant to help us understand the > impact of grants and programs on editing. This data was requested several > years ago, and we took a long time to bring in the privacy and

Re: [Analytics] Home directories of users belonging to analytics-privatedata-users will change permissions

2020-03-03 Thread Goran Milovanovic
Hi Luca, I do not understand how exactly would the suggested change impact my work on the stat100* machines, but I know that I need both - user analytics-privatedata, and - user goransm to be able to read and write any file in any directory in my home directory. Thanks. Best, Goran On Tue,

Re: [Analytics] Home directories of users belonging to analytics-privatedata-users will change permissions

2020-03-03 Thread Goran Milovanovic
atedata is part of analytics-privatedata-users so > everything will keep working :) > > Luca > > On Tue, 3 Mar 2020 at 19:11, Goran Milovanovic < > goran.s.milovano...@gmail.com> wrote: > >> Hi Luca, >> >> I do not understand how exactly would the

Re: [Analytics] [Research-Internal] Tutorials on disk space usage for notebook/stat boxes

2020-02-25 Thread Goran Milovanovic
Great job Luca. Thank you very much. I have started to diversify all WMDE Analytics jobs (mainly Wikidata related things) across the stat100* machines. While I still mainly use stat1007, two modules of the WDCM system are already

Re: [Analytics] Wiki comparison 2020 data is available

2021-02-24 Thread Goran Milovanovic
lease be aware that we may add or change columns in the future as needs > evolve. > > Warm regards, > Kate > > On Tue, Feb 23, 2021 at 12:37 PM Goran Milovanovic < > goran.milovanovic_...@wikimedia.de> wrote: > >> Well, it would be desirable to maintain c

Re: [Analytics] Wiki comparison 2020 data is available

2021-02-23 Thread Goran Milovanovic
Well, it would be desirable to maintain consistent column names across the years... Best, Goran Goran S. Milovanović, PhD Data Scientist, Software Department Wikimedia Deutschland "It's not the size of the dog in the fight, it's the size of the