This is awesome!! Very exciting to see the addition of statistical and data-mining algorithms to Apache Beam.
On Thu, Aug 3, 2017 at 2:32 PM, Eugene Kirpichov <[email protected]> wrote:

> +1, Very exciting! I have some suggestions on the exact API to expose (e.g.
> I think it makes sense to expose the CombineFn's directly, so that they can
> also be used for combining state cells and not just as PTransforms), but
> that can be handled during regular code review.
>
> On Thu, Aug 3, 2017 at 2:23 PM Sourabh Bajaj <[email protected]> wrote:
>
> > +1 to this.
> >
> > On Thu, Aug 3, 2017 at 6:28 AM Lukasz Cwik <[email protected]> wrote:
> >
> > > I'm most interested in the frequency / cardinality tools as it could be
> > > used to help improve performance automatically for combiners by detecting
> > > the few keys case or automatically handle hot keys without needing users
> > > to specify the hints when they use a combiner.
> > >
> > > On Thu, Aug 3, 2017 at 5:35 AM, Jean-Baptiste Onofré <[email protected]> wrote:
> > >
> > > > Nice work Arnaud ;)
> > > >
> > > > Happy to have been able to help.
> > > >
> > > > Let's see what the others will think about this.
> > > >
> > > > Regards
> > > > JB
> > > >
> > > > On 08/03/2017 02:32 PM, Arnaud Fournier wrote:
> > > >
> > > >> Hello everyone,
> > > >>
> > > >> My name is Arnaud Fournier and I am a CS student. I am currently doing
> > > >> an internship at Talend.
> > > >>
> > > >> With the support of Jean-Baptiste Onofre and Ismaël Mejia, I have been
> > > >> working on statistical analysis of streams with Beam, using
> > > >> probabilistic data structures like HyperLogLog.
> > > >>
> > > >> I would like to share this work with the community, but I wanted first
> > > >> to show you my work in progress and ask you if this humble contribution
> > > >> could be interesting as an extension.
> > > >>
> > > >> I have made a little doc with more details about what I have done in
> > > >> case you are interested and want to give me some feedback :
> > > >> https://docs.google.com/document/d/1Xy6g5RPBYX_HadpIr_2WrUeusiwL0Jo2ACI5PEOP1kc/edit
> > > >>
> > > >> You can also find the current work implementation in progress here :
> > > >>
> > > >> https://github.com/ArnaudFnr/beam/tree/sketching/sdks/java/extensions/sketching
> > > >>
> > > >> Thanks !
> > > >>
> > > >> Arnaud
> > > >
> > > > --
> > > > Jean-Baptiste Onofré
> > > > [email protected]
> > > > http://blog.nanthrax.net
> > > > Talend - http://www.talend.com
