Re: DataStreamUtils and Scala

2016-02-01 Thread Márton Balassi
I'll do the fix(es) tomorrow morning. Best, Marton On Mon, Feb 1, 2016, 20:05 Cory Monty wrote: > Thanks for the quick response! > > Either solution works for us. > > On Mon, Feb 1, 2016 at 12:07 PM, Stephan Ewen wrote: > >> I would actually re-add the "getJavaStream()" method. There are prob

Re: DataStreamUtils and Scala

2016-02-01 Thread Cory Monty
Thanks for the quick response! Either solution works for us. On Mon, Feb 1, 2016 at 12:07 PM, Stephan Ewen wrote: > I would actually re-add the "getJavaStream()" method. There are probably > other cases when people need it, and it does not hurt to expose it. > > Also, it allows you to combine p

Re: Redeployements and state

2016-02-01 Thread Ufuk Celebi
> On 01 Feb 2016, at 17:14, Don Frascuchon wrote: > > Hi, > > In reference to this topic, is there any feature for automatically restarting a > job after a task exception, like the --supervise command line option in Apache > Spark? If you are referring to job manager/task manager instances: No. Cur

Re: join with no element appearing in multiple join-pairs

2016-02-01 Thread Fridtjof Sander
I still have them. To be clear: I do need to compare each element with its successor, I just can't have one element paired multiple times in the same step. That's why I divide the join into two steps: data: x0 -> x1 -> x2 The first join will only pair (x0, x1). These may or may not be
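
The two-step split described in this message can be sketched in plain Python (this is not Flink API). The sketch assumes the elements form a single chain held in a list, and that the second step pairs the elements at odd positions with their successors; the snippet above is cut off before it says so, so that part is an assumption. The name `pair_step` is hypothetical.

```python
def pair_step(chain, offset):
    """Pair the elements at positions offset, offset + 2, ... with their successors.

    Within one step, no element appears in more than one pair.
    """
    return [(chain[i], chain[i + 1]) for i in range(offset, len(chain) - 1, 2)]


chain = ["x0", "x1", "x2"]
first = pair_step(chain, 0)   # step 1: pairs that start at even positions
second = pair_step(chain, 1)  # step 2 (assumed): pairs that start at odd positions
print(first, second)
```

Taken together, the two steps compare every element with its successor, while each single step uses an element at most once.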

Re: DataStreamUtils and Scala

2016-02-01 Thread Stephan Ewen
I would actually re-add the "getJavaStream()" method. There are probably other cases when people need it, and it does not hurt to expose it. Also, it allows you to combine programs written partly against the Java and Scala API (if you would ever want to do that). Stephan On Mon, Feb 1, 2016 at

DataStreamUtils and Scala

2016-02-01 Thread Cory Monty
Hey there, We were using DataStreamUtils.collect in Scala for automated testing, which only works because of `DataStream.getJavaStream` accessor in the Scala version of `DataStream`. However, a recent commit ( https://github.com/apache/flink/commit/086acf681f01f2da530c04289e0682c56f98a378) removed

Re: DataStreamUtils and Scala

2016-02-01 Thread Márton Balassi
Hey Cory, Sorry, I did not mean to break your code. One solution that I could suggest is to do it the way we have it for the batch api, namely having a scala version for DataStreamUtils too. It might be placed under flink-contrib for the time being. Would that solution fit your needs? On Mon, Fe

Re: Redeployements and state

2016-02-01 Thread Don Frascuchon
Hi, In reference to this topic, is there any feature for automatically restarting a job after a task exception, like the --supervise command line option in Apache Spark? Thanks in advance! On Tue, 26 Jan 2016 at 11:07, Ufuk Celebi wrote: > Hey Niels! > > Stephan gave a very good summary of

Re: Not able to import MBoxParser in MailCount.java

2016-02-01 Thread subash basnet
Yup it worked thank you. Regards, Subash On Mon, Feb 1, 2016 at 12:18 PM, Stefano Baghino < stefano.bagh...@radicalbit.io> wrote: >

Re: join with no element appearing in multiple join-pairs

2016-02-01 Thread Till Rohrmann
In the described case, can it be that you still have elements with `id % 2 == 1` in your data set or are they filtered out? If they are filtered out, then you can simply shift the indices for each iteration to the right. On Mon, Feb 1, 2016 at 12:32 PM, Fridtjof Sander < fsan...@mailbox.tu-berlin.

Re: cluster execution

2016-02-01 Thread Lydia Ickler
xD… a simple "hdfs dfs -chmod -R 777 /users" fixed it! > On 01.02.2016 at 12:17, Till Rohrmann wrote: > > Hi Lydia, > > It looks like that. I guess you should check your hdfs access rights. > > Cheers, > Till > > On Mon, Feb 1, 2016 at 11:28 AM, Lydia Ickler

Re: join with no element appearing in multiple join-pairs

2016-02-01 Thread Fridtjof Sander
Hi Till, thanks for your reply! The problem with that is that I sometimes combine two elements: So from x0 -> x1 -> x2 I join (x0, x1) which might become x0 -> x2 in the end. The indices from zipWithIndex then are 0 and 2, resulting in equal join flags. Sequential elements always have to
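
The index problem described in this message can be illustrated with a small plain-Python sketch (not Flink API). It assumes the join flag is the index parity and that two neighbours are only paired when their flags differ; `join_flag` and `can_pair` are hypothetical names used for illustration.

```python
def join_flag(index):
    # Parity of the index, used as the join flag (assumed).
    return index % 2


def can_pair(left_index, right_index):
    # Neighbours are paired only when their flags differ (assumed join condition).
    return join_flag(left_index) != join_flag(right_index)


# Chain x0 -> x1 -> x2 had indices 0, 1, 2. After (x0, x1) is combined,
# the chain is x0 -> x2, but the indices assigned earlier are still 0 and 2.
stale = [("x0", 0), ("x2", 2)]

# Re-indexing the shortened chain gives 0 and 1 again.
fresh = [(name, i) for i, (name, _) in enumerate(stale)]

print(can_pair(stale[0][1], stale[1][1]))  # False: both flags are 0
print(can_pair(fresh[0][1], fresh[1][1]))  # True: flags are 0 and 1
```

With ids assigned only once, the two remaining neighbours end up with the same flag and are never paired, which is why the indices are recomputed in each iteration.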

Re: join with no element appearing in multiple join-pairs

2016-02-01 Thread Till Rohrmann
Hi Fridtjof, I might miss something, but can’t you assign the ids once before starting the iteration and then reuse them throughout the iterations? Of course you would have to add another field to your input data but then you don’t have to run the zipWithIndex for every iteration. Cheers, Till

Re: Not able to import MBoxParser in MailCount.java

2016-02-01 Thread Stefano Baghino
Hi Subash, have you cloned the flink-training-exercises project from GitHub? It's not available on Maven Central, so you have to clone it and install it locally with mvn clean install. You can find more info about it here: http://dataartisans.github.io/flink-training/devSetup/handsOn.html#clone-an

Re: cluster execution

2016-02-01 Thread Till Rohrmann
Hi Lydia, It looks like that. I guess you should check your hdfs access rights. Cheers, Till On Mon, Feb 1, 2016 at 11:28 AM, Lydia Ickler wrote: > Hi Till, > > thanks for your reply! > I tested it with the Wordcount example. > Everything works fine if I run the command: > ./flink run -p 3 /hom

Not able to import MBoxParser in MailCount.java

2016-02-01 Thread subash basnet
Hello, I copied the plugin configuration for MBoxParser to my pom.xml file from plugin_link but there is no addition of library. So there is error: the import com.dataartisans cannot be resolved. Below is my pom.xml fil

Re: join with no element appearing in multiple join-pairs

2016-02-01 Thread Fridtjof Sander
(tried to reformat) Hi, I have a problem which seems to be unsolvable in Flink at the moment (1.0-Snapshot, current master branch) and I would kindly ask for some input, ideas on alternative approaches or just a confirmatory "yup, that doesn't work". ### Here's the situation: I have a datas

join with no element appearing in multiple join-pairs

2016-02-01 Thread Fridtjof Sander
Hi, I have a problem which seems to be unsolvable in Flink at the moment (1.0-Snapshot, current master branch) and I would kindly ask for some input, ideas on alternative approaches or just a confirmatory "yup, that doesn't work". ### Here's the situation: I have a dataset and its elements ar

Re: cluster execution

2016-02-01 Thread Lydia Ickler
Hi Till, thanks for your reply! I tested it with the Wordcount example. Everything works fine if I run the command: ./flink run -p 3 /home/flink/examples/WordCount.jar Then the program gets executed by my 3 workers. If I want to save the output to a file: ./flink run -p 3 /home/flink/examples/Wor

RE: Left join with unbalanced dataset

2016-02-01 Thread LINZ, Arnaud
Hi, Thanks, I can’t believe I missed the outer join operators… Will try them and will keep you informed. I use the “official” 0.10 release from the maven repo. The off-heap memory I use is the one HDFS I/O uses (codec, DFSOutputstream threads…), but I don’t have many open files at once, and doub