Cool! At the moment I don't have any good use cases, but I will read some literature about it in the near future. My first priority is to write a good streaming iteration example, and Márton liked the machine-learning idea. Also, there is a group at SZTAKI that develops recommendation systems, and we'd like to cooperate with them to implement some of their algorithms in Flink Streaming.
Peter

2015-02-26 23:30 GMT+01:00 Paris Carbone <par...@kth.se>:

> We haven’t yet implemented any of these machine learning models directly
> on the Flink API, but we have run them through the existing Samoa tasks,
> using Flink Streaming as a backend. Apart from that, we have a student
> looking into machine learning pipelines on Flink Streaming with a focus on
> iterative jobs, so we will have many more use cases coming soon. Are you
> also considering looking into something similar? Perhaps I can help more if
> you have some specific use case in mind.
>
> Paris
>
>
> On 23 Feb 2015, at 14:29, Szabó Péter <nemderogator...@gmail.com> wrote:
>
> Nice. Thank you guys!
>
> @Paris
> Are there any Flink implementations of this model? The GitHub doc is quite
> general.
>
> Peter
>
> 2015-02-23 14:05 GMT+01:00 Paris Carbone <par...@kth.se>:
>
> Hello Peter,
>
> Streaming machine learning algorithms make use of iterations quite widely.
> One simple example is implementing distributed stream learners. There, in
> many cases you need a central model aggregator, distributed estimators
> to offload the central node and, of course, feedback loops to merge
> everything back into the main aggregator periodically. One such example is
> the Vertical Hoeffding Tree Classifier (VHT) [1] that is implemented in
> Samoa.
>
> Iterative streams are also useful for optimisation techniques, as in batch
> processing (e.g. trying different parameters to estimate a variable, getting
> back the accuracy from an evaluator and repeating until a condition is
> met).
>
> I hope this helps to get a general idea of where iterations can be used.
>
> [1] https://github.com/yahoo/samoa/wiki/Vertical-Hoeffding-Tree-Classifier
>
>
> On 23 Feb 2015, at 12:13, Stephan Ewen <se...@apache.org> wrote:
>
> I think that the Samoa people have quite a few nice examples along the
> lines of model training with feedback.
>
> @Paris: What would be the simplest example?
>
> On Mon, Feb 23, 2015 at 11:27 AM, Szabó Péter <nemderogator...@gmail.com>
> wrote:
>
> Does anyone know of a good, simple and realistic streaming iteration
> example? The current example tests a random generator, but it should be
> replaced by something deterministic in order to be testable.
>
> Peter
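
The pattern described in the thread (distributed estimators, a central model aggregator, and a feedback loop that merges partial state periodically) can be sketched in plain Python without any Flink or Samoa API. This is only an illustration under assumptions: the "model" is a running mean, the stream is the deterministic sequence 1..10 so the result is testable, and the names `Partial`, `run_feedback_loop`, `n_workers` and `merge_every` are hypothetical, not taken from either project.

```python
from dataclasses import dataclass


@dataclass
class Partial:
    """Sufficient statistics for a running mean (hypothetical helper)."""
    count: int = 0
    total: float = 0.0

    def add(self, x: float) -> None:
        self.count += 1
        self.total += x

    def merge(self, other: "Partial") -> None:
        self.count += other.count
        self.total += other.total


def run_feedback_loop(stream, n_workers=3, merge_every=4):
    """Simulate distributed estimators feeding a central aggregator.

    Returns the list of model snapshots (global means), one per merge.
    """
    central = Partial()
    workers = [Partial() for _ in range(n_workers)]
    snapshots = []

    def merge_round():
        # Feedback edge: each worker ships its partial state to the
        # aggregator and starts over with an empty one.
        for i, w in enumerate(workers):
            central.merge(w)
            workers[i] = Partial()
        if central.count:
            snapshots.append(central.total / central.count)

    seen = 0
    for seen, x in enumerate(stream, start=1):
        # Round-robin partitioning keeps the run deterministic.
        workers[(seen - 1) % n_workers].add(x)
        if seen % merge_every == 0:
            merge_round()
    if seen % merge_every != 0:
        merge_round()  # flush whatever is left at the end of the stream
    return snapshots


snapshots = run_feedback_loop(range(1, 11))
print(snapshots)  # [2.5, 4.5, 5.5]
```

Because the input and the partitioning are both deterministic, the snapshots can be asserted exactly in a test, which is the property the original question asks for. A real streaming job would replace the in-process loop with an iterative stream and a feedback edge.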