Team,

Since everyone is here: we will be working on a machine learning infrastructure program this year. I will set up meetings with everyone on this thread, plus some others in SRE and Audiences, to gather a "bag of requests" of things that are missing. The first round of talks, which I hope to finish next week, is to hear everyone's requests and ideas; I will be sending meeting invites today and tomorrow. I think some themes will emerge from those. Thus far, it is pretty clear that we need a better way to deploy models to production (right now we deploy them to Elasticsearch in rather crafty ways, for example), we need an answer to the GPU question for training models, we need a "recommended way" to train and compute, we need some unified system for tracking models + data + tests, and finally, there are probably many lessons to take from the work done on ORES thus far.
Thanks,
Nuria

On Thu, Feb 7, 2019 at 8:40 AM Miriam Redi <[email protected]> wrote:
> Hey Andrew!
>
> Thank you so much for sharing this and starting this conversation. We had a
> meeting at All Hands with all the people interested in "Image Classification"
> (https://phabricator.wikimedia.org/T215413), and one of the open questions
> was exactly how to find a "common repository" for ML models that different
> groups and products within the organization can use. So, please, count me in!
>
> Thanks,
>
> M
>
> On Thu, Feb 7, 2019 at 4:38 PM Aaron Halfaker <[email protected]> wrote:
>
>> Just gave the article a quick read. I think this article pushes on some
>> key issues for sure. I definitely agree with the focus on Python/Jupyter
>> as essential for a productive workflow that leverages the best from
>> research scientists. We've been thinking about what ORES 2.0 would look
>> like, and event streams are the dominant proposal for improving on the
>> limitations of our queue-based worker pool.
>>
>> One of the nice things about ORES/revscoring is that it provides a nice
>> framework for operating using the *exact same code* no matter the
>> environment. E.g., it doesn't matter if we're calling out to an API to get
>> data for feature extraction or providing it via a stream. By investing in
>> a dependency injection strategy, we get that flexibility. So to me, the
>> hardest problem -- the one I don't quite know how to solve -- is how we'll
>> mix and merge streams to get all of the data we want available for feature
>> extraction. If I understand correctly, that's where Kafka shines. :)
>>
>> I'm definitely interested in fleshing out this proposal. We should
>> probably be exploring processes for training new types of models (e.g.,
>> image processing) using different strategies than ORES. In ORES, we're
>> almost entirely focused on using sklearn, but we have some basic
>> abstractions for other estimator libraries.
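The dependency-injection idea Aaron describes can be sketched roughly as follows. This is an illustrative toy, not the actual revscoring API: the function and variable names (`extract_features`, `get_text`, `api_get_text`, `stream_get_text_factory`) are made up for this example. The point is that the feature-extraction code itself never knows whether its input came from an API call or from a stream event; the caller injects the data-fetching behavior.

```python
# Sketch of dependency injection for feature extraction.
# All names here are hypothetical, not the real revscoring interface.

def extract_features(rev_id, get_text):
    """Extract features for a revision. `get_text` is injected by the
    caller, so this code is identical in every environment."""
    text = get_text(rev_id)
    return {
        "chars": len(text),
        "words": len(text.split()),
    }

# Environment 1: data fetched on demand (stand-in for an API client).
fake_api_store = {123: "some revision text"}

def api_get_text(rev_id):
    # In production this would call out to the MediaWiki API.
    return fake_api_store[rev_id]

# Environment 2: data already carried on a stream event
# (stand-in for a Kafka consumer handing us a message).
def stream_get_text_factory(event):
    return lambda rev_id: event["text"]

print(extract_features(123, api_get_text))
# -> {'chars': 18, 'words': 3}

event = {"rev_id": 456, "text": "text carried on the stream"}
print(extract_features(event["rev_id"], stream_get_text_factory(event)))
```

Swapping the injected `get_text` is the whole trick: the stream-merging problem Aaron raises is about producing events rich enough that the injected fetcher never has to fall back to an API call.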
>> We also make some strong assumptions about running on a single CPU that
>> could probably be broken for some performance gains using real concurrency.
>>
>> -Aaron
>>
>> On Thu, Feb 7, 2019 at 10:05 AM Goran Milovanovic
>> <[email protected]> wrote:
>>
>>> Hi Andrew,
>>>
>>> I have recently started a six-month AI/Machine Learning Engineering
>>> course which focuses exactly on the topics that you've shown interest in.
>>>
>>> So,
>>>
>>> I'd love it if we had a working group (or whatever) that focused
>>> on how to standardize how we train and deploy ML for production use.
>>>
>>> Count me in.
>>>
>>> Regards,
>>> Goran
>>>
>>> Goran S. Milovanović, PhD
>>> Data Scientist, Software Department
>>> Wikimedia Deutschland
>>>
>>> ------------------------------------------------
>>> "It's not the size of the dog in the fight,
>>> it's the size of the fight in the dog."
>>> - Mark Twain
>>> ------------------------------------------------
>>>
>>> On Thu, Feb 7, 2019 at 4:16 PM Andrew Otto <[email protected]> wrote:
>>>
>>>> Just came across
>>>> https://www.confluent.io/blog/machine-learning-with-python-jupyter-ksql-tensorflow
>>>>
>>>> In it, the author discusses some of what he calls the "impedance
>>>> mismatch" between data engineers and production engineers. The links to
>>>> Uber's Michelangelo <https://eng.uber.com/michelangelo/> (which, as far
>>>> as I can tell, has not been open sourced) and the "Hidden Technical Debt
>>>> in Machine Learning Systems" paper
>>>> <https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf>
>>>> are also very interesting!
>>>>
>>>> At All Hands I've been hearing more and more about using ML in
>>>> production, so these things seem very relevant to us. I'd love it if we
>>>> had a working group (or whatever) that focused on how to standardize how
>>>> we train and deploy ML for production use.
>>>> >>>> :) >>>> _______________________________________________ >>>> Analytics mailing list >>>> [email protected] >>>> https://lists.wikimedia.org/mailman/listinfo/analytics >>>> >>> >> >> -- >> >> Aaron Halfaker >> >> Principal Research Scientist >> >> Head of the Scoring Platform team >> Wikimedia Foundation >> _______________________________________________ >> Research-Internal mailing list >> [email protected] >> https://lists.wikimedia.org/mailman/listinfo/research-internal >> > _______________________________________________ > Research-Internal mailing list > [email protected] > https://lists.wikimedia.org/mailman/listinfo/research-internal >
_______________________________________________
Discovery mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/discovery
