Re: Simple Flink - Kafka Test

2016-02-10 Thread Stephan Ewen
Yes, 0.10.x does not always have Scala version suffixes. 1.0 is doing this consistently, should cause less confusion... On Wed, Feb 10, 2016 at 2:38 PM, shotte wrote: > Ok It is working now > > I had to change a few dependency with the _2.11 suffix > > Thanks > >

Re: TaskManager unable to register with JobManager

2016-02-10 Thread Ravinder Kaur
Hello Fabian, Thank you very much for the resource. I had already gone through this and have found port '6123' as default for taskmanager registration. But I want to know the specific range of ports the taskmanager access during job execution. The taskmanager always tries to access a random port

Re: Simple Flink - Kafka Test

2016-02-10 Thread shotte
Ok It is working now I had to change a few dependency with the _2.11 suffix Thanks Sylvain org.apache.flink flink-java ${flink.version}

Re: TaskManager unable to register with JobManager

2016-02-10 Thread Stephan Ewen
Note that some of these config options are only available starting from version 1.0-SNAPSHOT On Wed, Feb 10, 2016 at 2:16 PM, Fabian Hueske wrote: > Hi Ravinder, > > please have a look at the configuration documentation: > > --> >

Re: Flink 1.0-SNAPSHOT scala 2.11 in S3 has scala 2.10

2016-02-10 Thread David Kim
Hi Chiwan, Max, Thanks for checking. I also downloaded it now and verified the 2.10 jar is gone :) A new build must have overwrote yesterday's and corrected itself. flink-1.0-SNAPSHOT-bin-hadoop2_2.11.tgz 2016-02-10T15:55:33.000Z Thanks! David On Wed, Feb 10, 2016 at 4:44 AM, Maximilian

Compilation error while instancing FlinkKafkaConsumer082

2016-02-10 Thread Simone Robutti
Hello, the compiler has been raising an error since I added this line to the code val testData=streamEnv.addSource(new FlinkKafkaConsumer082[String]("data-input",new SimpleStringSchema(),kafkaProp)) Here is the error: Error:scalac: Class

Re: How to convert List to flink DataSet

2016-02-10 Thread subash basnet
Hello Fabian, As written before code: *DataSet fElements = env.fromCollection(findOutliers(clusteredPoints, finalCentroids));fElements.writeAsCsv(outputPath, "\n", " ");env.execute("KMeans Example");* I am very new to flink so not so clear about what you suggested, by option(1) you meant that

Re: Flink 1.0-SNAPSHOT scala 2.11 in S3 has scala 2.10

2016-02-10 Thread Stephan Ewen
We discovered yesterday that the snapshot builds were not updated in a while (because the build server experienced timeouts). Hence the SNAPSHOT build may have quite stale. It is updating frequently again now, that's probably why you find a correct build today... On Wed, Feb 10, 2016 at 5:31

Re: Flink 1.0-SNAPSHOT scala 2.11 in S3 has scala 2.10

2016-02-10 Thread Maximilian Michels
Hi David, Just had a check as well. Can't find a 2.10 Jar in the lib folder. Cheers, Max On Wed, Feb 10, 2016 at 6:17 AM, Chiwan Park wrote: > Hi David, > > I just downloaded the "flink-1.0-SNAPSHOT-bin-hadoop2_2.11.tgz” but there is > no jar compiled with Scala 2.10.

Re: How to convert List to flink DataSet

2016-02-10 Thread subash basnet
Hello Stefano, Yeah the type casting worked, thank you. But not able to print the Dataset to the file. The default below code which writes the KMeans points along with their centroid numbers to the file works fine: // feed new centroids back into next iteration DataSet

Re: Dataset filter improvement

2016-02-10 Thread Flavio Pompermaier
Because The classes are not related to each other. Do you think it's a good idea to have something like this? abstract class BaseClass(){ String someField; } class ExtendedClass1 extends BaseClass (){ String someOtherField11; String someOtherField12; String someOtherField13; ... }

Re: Dataset filter improvement

2016-02-10 Thread Stephan Ewen
Why not use an abstract base class and N subclasses? On Wed, Feb 10, 2016 at 10:05 AM, Fabian Hueske wrote: > Unfortunately, there is no Either<1,...,n>. > You could implement something like a Tuple3 Option>. However, Flink does not provide an Option type

Re: How to convert List to flink DataSet

2016-02-10 Thread Fabian Hueske
Hi Subash, how is findOutliers implemented? It might be that you mix-up local and cluster computation. All DataSets are processed in the cluster. Please note the following: - ExecutionEnvironment.fromCollection() transforms a client local connection into a DataSet by serializing it and sending

Re: Join two Datasets --> InvalidProgramException

2016-02-10 Thread Dominique Rondé
Hi Fabian, your hint was good! Maven fools me with the dependency management. Now everything works as expected! Many many thanks to all of you! Greets Dominique Am 10.02.2016 um 08:45 schrieb Fabian Hueske: Hi Dominique, can you check if the versions of the remotely running job manager &

Re: Dataset filter improvement

2016-02-10 Thread Fabian Hueske
Hi Flavio, I did not completely understand which objects should go where, but here are some general guidelines: - early filtering is mostly a good idea (unless evaluating the filter expression is very expensive) - you can use a flatMap function to combine a map and a filter - applying multiple

Re: Dataset filter improvement

2016-02-10 Thread Flavio Pompermaier
Yes, the intermediate dataset I create then join again between themselves. What I'd need is a Either<1,...,n>. Is that possible to add? Otherwise I was thinking to generate a Tuple2 and in the subsequent filter+map/flatMap deserialize only those elements I want to group togheter

Re: Dataset filter improvement

2016-02-10 Thread Fabian Hueske
Unfortunately, there is no Either<1,...,n>. You could implement something like a Tuple3. However, Flink does not provide an Option type (comes with Java8). You would need to implement it yourself incl. TypeInfo and Serializer. You can get some inspiration from the Either

Re: Dataset filter improvement

2016-02-10 Thread Flavio Pompermaier
What do you mean exactly..? Probably I'm missing something here..remember that I can specify the right subClass only after the last flatMap, after the first map neither me nor Flink can know the exact subclass of BaseClass On Wed, Feb 10, 2016 at 12:42 PM, Stephan Ewen wrote:

Re: Dataset filter improvement

2016-02-10 Thread Stephan Ewen
Class hierarchies should definitely work, even if the base class has no fields. They work more efficiently if you register the subclasses at the execution environment (Flink cannot infer them from the function signatures because the function signatures only contain the abstract base class). On

Re: TaskManager unable to register with JobManager

2016-02-10 Thread Ravinder Kaur
Hello All, I need to know the range of ports that are being used during the master/slave communication in the Flink cluster. Also is there a way I can specify a range of ports, at the slaves, to restrict them to connect to master only in this range? Kind Regards, Ravinder Kaur On Wed, Feb 3,

Re: How to convert List to flink DataSet

2016-02-10 Thread Fabian Hueske
Hi Subash, I would not fetch the data to the client, do the computation there, and send it back, just for the purpose of writing it to a file. Either 1) pull the results to the client and write the file from there or 2) compute the outliers in the cluster. I did not study your code completely,

Re: Simple Flink - Kafka Test

2016-02-10 Thread shotte
Hi, Thanks for your reply, but I am still a bit confuse. I have downloaded flink-0.10.1-bin-hadoop27-scala_2.11.tgz and kafka_2.11-0.9.0.0.tgz I did not install myself Scala Now tell me if I understand correctly. Depending on the version of Flink I have (in my case the scala 2.11) I must

Re: TaskManager unable to register with JobManager

2016-02-10 Thread Fabian Hueske
Hi Ravinder, please have a look at the configuration documentation: --> https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#jobmanager-amp-taskmanager Best, Fabian 2016-02-10 13:55 GMT+01:00 Ravinder Kaur : > Hello All, > > I need to know the range of