My thoughts:

1. Sounds good!
2. I think it would be better to keep this separate, so we can focus on one problem at a time.
3. I think that depends on how hard it would be to add later.
4. Not sure.

On Wed, May 9, 2018 at 7:39 AM, Saikat Kanjilal <sxk1...@hotmail.com> wrote:

> FYI for those who don't know about Michelangelo:
> https://eng.uber.com/michelangelo/
>
> Meet Michelangelo: Uber's Machine Learning Platform
> Uber Engineering introduces Michelangelo, our machine
> learning-as-a-service system that enables teams to easily build, deploy,
> and operate ML solutions at scale.
>
> ________________________________
> From: Saikat Kanjilal <sxk1...@hotmail.com>
> Sent: Wednesday, May 9, 2018 7:35 AM
> To: dev@heron.incubator.apache.org; Karthik Ramasamy
> Subject: Re: [DISCUSS] A design proposal for incorporating machine
> learning algorithms into heron
>
> Hi Folks,
>
> I was thinking about how to drive this initiative and had some ideas
> around execution; I would love some feedback:
>
> 1) While the discussion around the design is happening, I was thinking of
> building a small prototype with one of the algorithms. The prototype will
> be a first-cut representation of the design, in which we represent one
> algorithm as a Storm topology. Looking at the list of algorithms that
> we're thinking about bringing over from SAMOA
> (https://samoa.incubator.apache.org/documentation/SAMOA-and-Machine-Learning.html),
> distributed stream clustering looks the most valuable for a prototype.
> Thoughts?
>
> Apache SAMOA and Machine Learning
> SAMOA’s main goal is to help developers to create easily machine
> learning algorithms on top of any distributed stream processing engine.
>
> 2) I would like to leverage some of the ideas in Michelangelo, as well as
> my previous experience building a tool that versions, deploys, and
> associates ML models with newly arriving windows of data. In actuality, I
> feel this is a completely orthogonal initiative that we also need to
> design. Should it be part of the design doc at this point? Thoughts?
>
> 3) Should we address security in streaming machine learning models for the
> first release?
>
> 4) The design doc mentions a GenericMLOutputModelSink. I was thinking of
> this as something like a factory method that has underlying
> representations of the various sinks that already exist, which I'm hoping
> to leverage; see here:
> https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/bk_storm-component-guide/content/ch_storm-connectors.html
>
> @Karthik Ramasamy<mailto:kart...@streaml.io> et al., I would love to get
> thoughts on how we proceed with this initiative at this point. In the
> meantime, I will get started with 1 to test the feasibility of this
> design.
>
> Regards
>
> Chapter 5. Moving Data Into and Out of Apache Storm Using ...
> This chapter focuses on moving data into and out of Apache Storm through
> the use of spouts and bolts. Spouts read data from external sources to
> ingest data into a topology.
>
> ________________________________
> From: Saikat Kanjilal <sxk1...@hotmail.com>
> Sent: Monday, May 7, 2018 2:31 PM
> To: dev@heron.incubator.apache.org
> Subject: [DISCUSS] A design proposal for incorporating machine learning
> algorithms into heron
>
> Hello Dev community,
>
> I have created the initial API design documentation for building Storm
> topologies around a set of streaming machine learning algorithms here:
> https://docs.google.com/document/d/1LrO7XRcMxJoMM83wjRd-Ov74VAaomA_mXOAhCStgGng/edit?usp=sharing
> This is very much a work in progress, but I wanted to start getting early
> feedback from the community, as it's a lot of complex operations
> representing a streaming ML pipeline using Heron. This design leverages
> Apache SAMOA to figure out which algorithms to focus on bringing into
> Heron.
>
> Thank you, Karthik Ramasamy, for your mentoring on this. The goal will be
> to represent all the algorithms in phase 1 as Storm topologies and then to
> evolve this into a streamlet-based architecture. I would really
> appreciate some feedback from the community.
>
> While you guys are commenting on the initial approach, I will: 1) finish
> the design for the rest of the algorithms for phase 1, and 2) start the
> design for building out a Heron streamlet-based architecture to run on top
> of the Storm-based topologies.
>
> I look forward to a productive discussion around the design.