Re: bigpetstore flink : parallelizing collections

2015-07-13 Thread jay vyas
OK. Now, **my thoughts** on this are that it should be synergistic with Flink's needs, rather than an orthogonal task that you guys help us with, so please keep us updated on what your needs are so that the work stays synergistic: https://issues.apache.org/jira/browse/BIGTOP-1927 On Mon, Jul 13, 2015 at

Re: bigpetstore flink : parallelizing collections

2015-07-13 Thread Maximilian Michels
Absolutely. I see it as a synergistic process too. I just learned about BigTop. As for the packaging, I think Flink doesn't have very different demands compared to the other frameworks already integrated. As for the rest, I'm not familiar enough with BigTop. Currently, Henry is the only Flink

bigpetstore flink : parallelizing collections

2015-07-12 Thread jay vyas
Hi Flink. I'm happy to announce that I've done a small bit of initial hacking on bigpetstore-flink, in order to represent in Flink what we do in Spark. TL;DR: the main question is at the bottom! Currently, I want to generate transactions for a list of customers. The generation of transactions is

Re: bigpetstore flink : parallelizing collections

2015-07-12 Thread Stephan Ewen
Hi Jay! You can use the fromCollection() or fromElements() method to create a DataSet or DataStream from a Java/Scala collection. That moves the data into the cluster and allows you to run parallel transformations on the elements. Make sure you set the parallelism of the operation that you want
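
For readers following along, here is a rough, dependency-free sketch of the idea described above: take a local collection, split it across a fixed number of workers, and run the same transformation on each slice. It is plain Java and deliberately does not use the Flink API; the thread pool merely stands in for a cluster, and the `parallelism` argument stands in for the parallelism you would set on the Flink operation. The class name, `generateTransaction`, and `runInParallel` are hypothetical names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelCollectionSketch {

    // Hypothetical stand-in for the per-customer transaction generator.
    static String generateTransaction(String customer) {
        return customer + ":txn";
    }

    // Splits a local collection into `parallelism` slices and transforms
    // each slice on its own worker thread. This only mimics, in plain Java,
    // what a parallel operator over a collection-backed data set would do.
    static List<String> runInParallel(List<String> customers, int parallelism)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<Callable<List<String>>> tasks = new ArrayList<>();
            for (int i = 0; i < parallelism; i++) {
                final int slice = i;
                tasks.add(() -> {
                    List<String> out = new ArrayList<>();
                    // Round-robin assignment: worker k handles k, k+p, k+2p, ...
                    for (int j = slice; j < customers.size(); j += parallelism) {
                        out.add(generateTransaction(customers.get(j)));
                    }
                    return out;
                });
            }
            // invokeAll blocks until every slice is done and returns the
            // futures in the same order as the submitted tasks.
            List<String> results = new ArrayList<>();
            for (Future<List<String>> f : pool.invokeAll(tasks)) {
                results.addAll(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> customers = Arrays.asList("alice", "bob", "carol", "dave", "erin");
        System.out.println(runInParallel(customers, 2));
    }
}
```

With a parallelism of 1 everything runs on a single worker, which is why setting the parallelism of the transformation matters once the collection has been handed over.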

Re: bigpetstore flink : parallelizing collections

2015-07-12 Thread jay vyas
Awesome, thanks! I'll try it out. This is part of a wave of JIRAs for Bigtop Flink integration. If your distro/packaging folks collaborate with us, it will save you time in the long run, because you can piggyback on the Bigtop infra for RPM/DEB packaging, smoke testing, and HDFS interop testing