Re: Data Preprocessing in Beam

2018-11-01 Thread Ismaël Mejía
Since the extension (module) will probably take the form of new PTransforms for Beam, it is worth taking a look at: https://beam.apache.org/contribute/ptransform-style-guide/ and of course at: https://beam.apache.org/contribute/ On Wed, Oct 31, 2018 at 6:57 PM Alex wrote: > > Great. Thank you. >

Re: Data Preprocessing in Beam

2018-10-31 Thread Kenneth Knowles
The word "extension" doesn't really mean anything in the case of Beam. It is just a library. You can use the build setup of other libraries as examples. Kenn On Wed, Oct 31, 2018 at 10:23 AM Alejandro wrote: > Hello, > > I am going to familiarize myself with how to write a Beam extension then, >

Re: Data Preprocessing in Beam

2018-10-31 Thread Alejandro
Hello, I am going to familiarize myself with how to write a Beam extension then, although right now I am a little busy searching for a new job :-/. I hope that in a few weeks (let's hope it doesn't take much longer to find a job) I can get hands-on with this and contribute this preprocessing extension

Re: Data Preprocessing in Beam

2018-10-31 Thread Ismaël Mejía
Hello, I mentored Arnaud in contributing the sketching extension to Beam, and from a quick look at Alex's paper + implementation, I think this should be an independent extension. Sketching is a collection of transforms that rely on probabilistic data structures to give approximate results and

Re: Data Preprocessing in Beam

2018-10-25 Thread Alex
Great! Right now there is a lot in that code I do not understand; I hope I can read up on it in the next few days. Should I reimplement my algorithms in Scala? Or could I create a wrapper that interfaces with the sketching extension? Cheers. On Oct 24, 2018 15:00, Maximilian Michels wrote: > >

Re: Data Preprocessing in Beam

2018-10-24 Thread Maximilian Michels
Welcome Alejandro! Interesting work. The sketching extension looks like a good place for your algorithms. -Max On 23.10.18 19:05, Lukasz Cwik wrote: Arnaud Fournier (afourn...@talend.com) started by adding a library to support sketching

Re: Data Preprocessing in Beam

2018-10-23 Thread Lukasz Cwik
Arnaud Fournier (afourn...@talend.com) started by adding a library to support sketching (https://github.com/apache/beam/tree/master/sdks/java/extensions/sketching). I feel as though some of these could be added there or possibly within another extension. On Tue, Oct 23, 2018 at 9:54 AM Austin

Data Preprocessing in Beam

2018-10-23 Thread Austin Bennett
Hi Beam Devs, Alejandro, copied, is an enthusiastic developer who recently coded up: https://github.com/elbaulp/DPASF (associated paper: https://arxiv.org/abs/1810.06021). He had been looking to contribute that code to FlinkML, at which point I found him and alerted him to Beam. He has