Re: Job fails with FileNotFoundException from blobStore

2015-02-05 Thread Till Rohrmann
Hi Robert, thanks for the info. If the TaskManager/JobManager does not shut down properly, e.g. because the process was killed, then it is indeed the case that the BlobManager cannot properly remove all stored files. I don't know whether this has recently been the case for you. Furthermore, the files are not directly

Re: Runtime generated (source) datasets

2015-01-21 Thread Till Rohrmann
Hi Flavio, if your question was whether you can write a Flink job which can read input from different sources, depending on the user input, then the answer is yes. The Flink job plans are actually generated at runtime so that you can easily write a method which generates a user dependent

Re: Runtime generated (source) datasets

2015-01-21 Thread Till Rohrmann
: ListDatasetElementType getInput(String[] args, ExecutionEnvironment env) {} So I don't know in advance how many of them I'll have at runtime. Does it still work? On Wed, Jan 21, 2015 at 1:55 PM, Till Rohrmann trohrm...@apache.org wrote: Hi Flavio, if your question was whether you can

Re: Community vote for Hadoop Summit result

2015-01-30 Thread Till Rohrmann
Great news Márton. Congrats! On Jan 30, 2015 2:41 AM, Henry Saputra henry.sapu...@gmail.com wrote: W00t! Congrats guys! On Thu, Jan 29, 2015 at 4:06 PM, Márton Balassi mbala...@apache.org wrote: Hi everyone, Thanks for your support for the Flink talks at the community choice for the

Re: Nested Iterations supported in Flink?

2015-04-14 Thread Till Rohrmann
If your inner iteration happens to work only on the data of a single partition, then you can also implement this iteration as part of a mapPartition operator. The only problem there would be that you have to keep all the partition's data on the heap, if you need access to it. Cheers, Till On

Re: flink ml - k-means

2015-04-27 Thread Till Rohrmann
Hi Paul, if you can't wait, a vanilla implementation is already contained as part of the Flink examples. You should find it under flink/flink-examples. But we will try to add more clustering algorithms in the near future. Cheers, Till On Apr 26, 2015 11:14 PM, Alexander Alexandrov

Re: Left outer join

2015-04-16 Thread Till Rohrmann
You can materialize the input of the right input by creating an array out of it, for example. Then you can reiterate over it. Cheers, Till On Apr 16, 2015 7:37 PM, Flavio Pompermaier pomperma...@okkam.it wrote: Hi Maximilian, I tried your solution but it doesn't work because the rightElements

Re: Left outer join

2015-04-17 Thread Till Rohrmann
like to have a unique dataset D3(Tuple4) like A,X,a1,a2 B,Y,b1,null Basically filling with D1.f0,D2.f0,D1.f2(D1.f1==p1),D1.f2(if D1.f1==p2) when D1.f2==D2.f0. Is that possible and how? Could you show me a simple snippet? Thanks in advance, Flavio On Thu, Apr 16, 2015 at 9:48 PM, Till

Re: Left outer join

2015-04-17 Thread Till Rohrmann
to accumulate stuff in a local variable is it safe if datasets are huge..? On Fri, Apr 17, 2015 at 9:54 AM, Till Rohrmann till.rohrm...@gmail.com wrote: If it's fine when you have null string values in the cases where D1.f1!=a1 or D1.f2!=a2 then a possible solution could look like (with Scala

Re: Left outer join

2015-04-17 Thread Till Rohrmann
value of D1.f0 you can have at most one value of a1 and a2) Is it more clear? On Fri, Apr 17, 2015 at 9:03 AM, Till Rohrmann till.rohrm...@gmail.com wrote: Hi Flavio, I don't really understand what you try to do. What does D1.f2(D1.f1==p1) mean? What does happen if the condition in D1.f2

Re: Fwd: External Talk: Apache Flink - Speakers: Kostas Tzoumas (CEO dataArtisans), Stephan Ewen (CTO dataArtisans)

2015-04-09 Thread Till Rohrmann
. Looking for papers on similarity search might help. -s On 07.04.2015 15:19, Till Rohrmann wrote: I don't know whether my ideas are much better than the cartesian product solution. As a matter of fact at some point we have to replicate the data to be able to compute the correlations

Re: Fwd: External Talk: Apache Flink - Speakers: Kostas Tzoumas (CEO dataArtisans), Stephan Ewen (CTO dataArtisans)

2015-04-07 Thread Till Rohrmann
I don't know whether my ideas are much better than the cartesian product solution. As a matter of fact at some point we have to replicate the data to be able to compute the correlations in parallel. There are basically 3 ideas I had: 1. Broadcast U and V and simply compute the correlation for

Re: Flink meetup group in Stockholm

2015-04-08 Thread Till Rohrmann
Really cool :-) On Wed, Apr 8, 2015 at 5:09 PM, Maximilian Michels m...@apache.org wrote: Love the purple. Have fun! :) On Wed, Apr 8, 2015 at 5:05 PM, Henry Saputra henry.sapu...@gmail.com wrote: Nice, congrats! On Wed, Apr 8, 2015 at 7:39 AM, Gyula Fóra gyf...@apache.org wrote: Hey

Re: k means - waiting for dataset

2015-05-21 Thread Till Rohrmann
Hi Paul, could you share your code with us so that we can see whether there is any error? Does this error also occur with 0.9-SNAPSHOT? Cheers, Till On Thu, May 21, 2015 at 11:11 AM, Pa Rö paul.roewer1...@googlemail.com wrote: hi flink community, i have implement k-means for clustering

Re: flink k-means on hadoop cluster

2015-06-08 Thread Till Rohrmann
I assume that the paths inputs and outputs are not correct since you get the error message *chown `output’: No such file or directory*. Try to provide the full path to the chown command, such as hdfs://ServerURI/path/to/your/directory. On Mon, Jun 8, 2015 at 11:23 AM Pa Rö

Re: flink k-means on hadoop cluster

2015-06-08 Thread Till Rohrmann
:///127.0.0.1:8020/user/cloudera/inputs/ how i must set the path to hdfs?? 2015-06-08 11:38 GMT+02:00 Till Rohrmann till.rohrm...@gmail.com: I assume that the path inputs and outputs is not correct since you get the error message *chown `output’: No such file or directory*. Try to provide

Re: Flink 0.9 built with Scala 2.11

2015-06-10 Thread Till Rohrmann
No there are no Scala 2.11 Flink binaries which you can download. You have to build it yourself. Cheers, Till On Wed, Jun 10, 2015 at 3:19 PM Philipp Goetze philipp.goe...@tu-ilmenau.de wrote: Thank you Chiwan! I did not know the master has a 2.11 profile. But there is no pre-built Flink

Re: Apache Flink transactions

2015-06-04 Thread Till Rohrmann
But what you can do to simulate an insert is to read the new data in a separate DataSet and then apply an union operator on the new and old DataSet . Cheers, Till ​ On Thu, Jun 4, 2015 at 9:00 AM, Chiwan Park chiwanp...@icloud.com wrote: Hi. Flink is not DBMS. There is no equivalent
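A minimal sketch of the union-based "insert" Till describes (the data paths and element type are made up for illustration):

```scala
import org.apache.flink.api.scala._

object UnionAsInsert {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // hypothetical paths; replace with your own sources
    val oldData: DataSet[String] = env.readTextFile("hdfs:///data/existing")
    val newData: DataSet[String] = env.readTextFile("hdfs:///data/new")

    // simulate an insert by unioning the new records with the old ones
    val combined = oldData.union(newData)

    combined.writeAsText("hdfs:///data/combined")
    env.execute("union as insert")
  }
}
```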

Re: Log messages - redirect

2015-06-19 Thread Till Rohrmann
If I’m not mistaken from the shown output, you’re talking about the stdout output of the client, right? This output is not controlled by the log4j.properties or logback.xml file. However, you can use any command line tool available on your platform to redirect the stdout. For example on a Linux

Re: Logging in Flink 0.9.0-milestone-1

2015-06-26 Thread Till Rohrmann
Hi Stefan, You can do this if you disableSysoutLogging and change your log4j-cli.properties file to also print to console. There you can then control what is logged to console. However, I think that you have to set disableSysoutLogging in your program. Cheers, Till ​ On Fri, Jun 26, 2015 at

Re: Flink-ML as Dependency

2015-06-11 Thread Till Rohrmann
Hi Max, I just tested a build using gradle (with your build.gradle file) and some flink-ml algorithms. And it was completed without the problem of the unresolved breeze dependency. I use the version 2.2.1 of Gradle. Which version are you using? Since you’re using Flink’s snapshots and have

Re: Flink-ML as Dependency

2015-06-10 Thread Till Rohrmann
Hi Max, I think the reason is that the flink-ml pom contains as a dependency an artifact with artifactId breeze_${scala.binary.version}. The variable scala.binary.version is defined in the parent pom and not substituted when flink-ml is installed. Therefore gradle tries to find a dependency with

Re: Flink-ML as Dependency

2015-06-11 Thread Till Rohrmann
On Thu, Jun 11, 2015 at 11:22 AM, Till Rohrmann trohrm...@apache.org wrote: Hi Max, I just tested a build using gradle (with your build.gradle file) and some flink-ml algorithms. And it was completed without the problem of the unresolved breeze dependency. I use the version 2.2.1 of Gradle

Re: Monitoring memory usage of a Flink Job

2015-06-15 Thread Till Rohrmann
Hi Tamara, you can instruct Flink to write the current memory statistics to the log by setting taskmanager.debug.memory.startLogThread: true in the Flink configuration. Furthermore, you can control the logging interval with taskmanager.debug.memory.logIntervalMs where the interval is specified in
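The two settings mentioned above go into flink-conf.yaml; a sketch (the interval value is an arbitrary example):

```yaml
# write the TaskManager's current memory statistics to its log
taskmanager.debug.memory.startLogThread: true
# logging interval in milliseconds (example value)
taskmanager.debug.memory.logIntervalMs: 10000
```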

Re: Random Selection

2015-06-15 Thread Till Rohrmann
Hi Max, the problem is that you’re trying to serialize the companion object of scala.util.Random. Try to create an instance of the scala.util.Random class and use this instance within your RichFilterFunction to generate the random numbers. Cheers, Till On Mon, Jun 15, 2015 at 1:56 PM Maximilian
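A sketch of what Till suggests — creating the Random instance per task in open() rather than referencing the companion object (class and field names are invented for illustration):

```scala
import org.apache.flink.api.common.functions.RichFilterFunction
import org.apache.flink.configuration.Configuration

// Each parallel task creates its own Random in open(), so nothing
// non-serializable is captured in the closure.
class SamplingFilter(fraction: Double) extends RichFilterFunction[Int] {
  @transient private var rnd: scala.util.Random = _

  override def open(parameters: Configuration): Unit = {
    rnd = new scala.util.Random()
  }

  override def filter(value: Int): Boolean = rnd.nextDouble() < fraction
}
```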

Re: Random Shuffling

2015-06-15 Thread Till Rohrmann
Hi Max, you can always shuffle your elements using the rebalance method. What Flink does here is to distribute the elements of each partition among all available TaskManagers. This happens in a round-robin fashion and is thus not completely random. A different means is the partitionCustom method
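Both variants mentioned above can be sketched as follows (the random partitioner is an illustrative example, not the author's code):

```scala
import org.apache.flink.api.common.functions.Partitioner
import org.apache.flink.api.scala._

object ShuffleSketch {
  // custom partitioner assigning each element to a random channel
  class RandomPartitioner extends Partitioner[Int] {
    @transient private lazy val rnd = new scala.util.Random()
    override def partition(key: Int, numPartitions: Int): Int =
      rnd.nextInt(numPartitions)
  }

  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val data = env.fromElements(1, 2, 3, 4, 5, 6, 7, 8)

    // round-robin redistribution (not truly random)
    val rebalanced = data.rebalance()

    // random redistribution via partitionCustom, keyed on the element itself
    val shuffled = data.partitionCustom(new RandomPartitioner, x => x)

    shuffled.print()
  }
}
```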

Re: Choosing random element

2015-06-16 Thread Till Rohrmann
This might help you [1]. Cheers, Till [1] http://stackoverflow.com/questions/2514061/how-to-pick-random-small-data-samples-using-map-reduce On Tue, Jun 16, 2015 at 10:16 AM Maximilian Alber alber.maximil...@gmail.com wrote: Hi Flinksters, again a similar problem. I would like to choose ONE

Re: Help with Flink experimental Table API

2015-06-11 Thread Till Rohrmann
Hi Shiti, here is the issue [1]. Cheers, Till [1] https://issues.apache.org/jira/browse/FLINK-2203 On Thu, Jun 11, 2015 at 8:42 AM Shiti Saxena ssaxena@gmail.com wrote: Hi Aljoscha, Could you please point me to the JIRA tickets? If you could provide some guidance on how to resolve

Re: Flink 0.9 built with Scala 2.11

2015-06-15 Thread Till Rohrmann
+1 for giving only those modules a version suffix which depend on Scala. On Sun, Jun 14, 2015 at 8:03 PM Robert Metzger rmetz...@apache.org wrote: There was already a discussion regarding the two options here [1], back then we had a majority for giving all modules a scala suffix. I'm against

Re: Building master branch is failed

2015-05-29 Thread Till Rohrmann
Yes, this is another error. Seems to be related to the new scala shell. On Fri, May 29, 2015 at 11:00 AM, Chiwan Park chiwanp...@icloud.com wrote: I fetched master branch and ran again. But I got the same error. It seems that the problem is related to javadoc. Till’s fix is related to

Re: OutOfMemoryException: unable to create native thread

2015-07-01 Thread Till Rohrmann
Hi Chan, if you feel up to implementing such an input format, then you can also contribute it. You simply have to open a JIRA issue and take ownership of it. Cheers, Till On Wed, Jul 1, 2015 at 10:08 AM, chan fentes chanfen...@gmail.com wrote: Thank you all for your help and for pointing out

Re: time measured for each iteration in KMeans

2015-07-01 Thread Till Rohrmann
Do you also have the rest of the code? It would be helpful in order to find out why it's not working. Cheers, Till On Wed, Jul 1, 2015 at 1:31 PM, Pa Rö paul.roewer1...@googlemail.com wrote: now i have implement a time logger in the open and close methods, it works fine, but i try to

Re: The slot in which the task was scheduled has been killed (probably loss of TaskManager)

2015-06-30 Thread Till Rohrmann
Do you have the JobManager and TaskManager logs of the corresponding TM, by any chance? On Mon, Jun 29, 2015 at 8:12 PM, Andra Lungu lungu.an...@gmail.com wrote: Something similar in flink-0.10-SNAPSHOT: 06/29/2015 10:33:46 CHAIN Join(Join at main(TriangleCount.java:79)) - Combine

Re: writeAsCsv not writing anything on HDFS when WriteMode set to OVERWRITE

2015-06-30 Thread Till Rohrmann
Hi Mihail, have you checked that the DataSet you want to write to HDFS actually contains data elements? You can try calling collect which retrieves the data to your client to see what’s in there. Cheers, Till ​ On Tue, Jun 30, 2015 at 12:01 PM, Mihail Vieru vi...@informatik.hu-berlin.de wrote:

Re: k means - waiting for dataset

2015-05-21 Thread Till Rohrmann
Concerning your first problem that you only see one resulting centroid, your code looks good modulo the parts you haven't posted. However, your problem could simply be caused by a bad selection of initial centroids. If, for example, all centroids except for one don't get any points assigned, then

Re: Write File on specific machine

2015-05-22 Thread Till Rohrmann
Hi Hilmi, in Flink you cannot control on which machines the individual tasks are scheduled. Therefore, you cannot control on which machine the data is written. However, you can ensure that a file is created on only one machine by setting the degree of parallelism of the DataSink to *1*. Cheers,
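A minimal sketch of the single-file sink (path and data are placeholders):

```scala
import org.apache.flink.api.scala._

object SingleFileSink {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val data = env.fromElements("a", "b", "c")

    // parallelism 1 on the sink => a single task writes a single file
    // (which machine it runs on is still decided by the scheduler)
    data.writeAsText("file:///tmp/output").setParallelism(1)

    env.execute()
  }
}
```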

Re: SLIDES: Overview of Apache Flink: Next-Gen Big Data Analytics Framework

2015-07-07 Thread Till Rohrmann
Nice, thanks for sharing Slim. Cheers, Till On Tue, Jul 7, 2015 at 6:19 AM, Slim Baltagi sbalt...@gmail.com wrote: Hi This is the link *http://goo.gl/gVOSp8* to the slides of my talk on June 30, 2015 at the Chicago Apache Flink meetup. Although most of the current buzz is about Apache

Re: Tuple model project

2015-07-30 Thread Till Rohrmann
at 1:41 PM, Till Rohrmann trohrm...@apache.org wrote: Hi Flavio, in order to use the Kryo serializer for a given type you can use the registerTypeWithKryoSerializer of the ExecutionEnvironment object. What you provide to the method is the type you want to be serialized with kryo

Re: Tuple model project

2015-07-30 Thread Till Rohrmann
Hi Flavio, in order to use the Kryo serializer for a given type you can use the registerTypeWithKryoSerializer of the ExecutionEnvironment object. What you provide to the method is the type you want to be serialized with kryo and an implementation of the com.esotericsoftware.kryo.Serializer
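A sketch of the registration Till describes; the type and serializer below are invented placeholders for a class that Flink cannot handle as a POJO:

```scala
import com.esotericsoftware.kryo.{Kryo, Serializer}
import com.esotericsoftware.kryo.io.{Input, Output}
import org.apache.flink.api.scala._

// hypothetical non-POJO type, for illustration only
class LegacyRecord(val payload: String)

class LegacyRecordSerializer extends Serializer[LegacyRecord] {
  override def write(kryo: Kryo, output: Output, rec: LegacyRecord): Unit =
    output.writeString(rec.payload)

  override def read(kryo: Kryo, input: Input,
                    clazz: Class[LegacyRecord]): LegacyRecord =
    new LegacyRecord(input.readString())
}

object KryoRegistration {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    // register the type together with its Kryo serializer
    env.getConfig.registerTypeWithKryoSerializer(
      classOf[LegacyRecord], classOf[LegacyRecordSerializer])
    // ... build the job as usual
  }
}
```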

Re: java.lang.ClassNotFoundException when deploying streaming jar locally

2015-08-06 Thread Till Rohrmann
Hi Michael, in the flink-test-0.1.jar the class DaoJoin$1.class is located at com/davengo/rfidcloud/flink but Flink tries to load com.otter.ist.flink.DaoJoin$1. This might be the problem. This is somehow odd because in the source code you’ve specified the correct package com.otter.ist.flink.

Re: Pass not serializable objects to Flink transformation functions

2015-07-27 Thread Till Rohrmann
Hi Flavio, for the user code logic Flink uses exclusively Java serialization. What you can do, though, is to override the readObject and writeObject methods which are used by Java serialization. Within the methods you can serialize the other object you’re referencing. Cheers, Till ​ On Mon, Jul
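A sketch of the readObject/writeObject override Till mentions; the Parser helper stands in for any non-serializable object referenced from a UDF (names invented for illustration):

```scala
import java.io.{IOException, ObjectInputStream, ObjectOutputStream}
import org.apache.flink.api.common.functions.MapFunction

// hypothetical non-serializable helper
class Parser(val pattern: String) {
  def parse(s: String): String = s.stripPrefix(pattern)
}

class ParsingMapper(pattern: String) extends MapFunction[String, String] {
  // not serialized directly; reconstructed in readObject
  @transient private var parser: Parser = new Parser(pattern)

  override def map(value: String): String = parser.parse(value)

  @throws[IOException]
  private def writeObject(out: ObjectOutputStream): Unit = {
    out.defaultWriteObject()
    // manually write what is needed to rebuild the helper
    out.writeUTF(parser.pattern)
  }

  @throws[IOException]
  private def readObject(in: ObjectInputStream): Unit = {
    in.defaultReadObject()
    parser = new Parser(in.readUTF())
  }
}
```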

Re: filter as termination condition

2015-07-22 Thread Till Rohrmann
Sachin is right that the filter has to be inverted. Furthermore, the join operation is not right here. You have to do a kind of a left outer join where you only keep the elements which join with NULL. Here is an example of how one could do it [1]. Cheers, Till [1]
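The "keep only the elements which join with NULL" idea can be sketched with a coGroup (types and data are invented for illustration):

```scala
import org.apache.flink.api.scala._
import org.apache.flink.util.Collector

object AntiJoinSketch {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    val current   = env.fromElements((1, "a"), (2, "b"), (3, "c"))
    val converged = env.fromElements((2, "b"))

    // keep only elements of `current` without a partner in `converged`
    // -- the coGroup equivalent of a left outer join filtered on NULL
    val remaining = current.coGroup(converged)
      .where(0).equalTo(0)
      .apply { (left, right, out: Collector[(Int, String)]) =>
        if (right.isEmpty) left.foreach(out.collect)
      }

    remaining.print()
  }
}
```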

Re: Scala: registerAggregationConvergenceCriterion

2015-07-17 Thread Till Rohrmann
Hi Max, I’d recommend you to use the DataSet[T].iterateWithTermination method instead. It has the following syntax: iterateWithTermination(maxIterations: Int)(stepFunction: DataSet[T] => (DataSet[T], DataSet[_])): DataSet[T] There you see that your step function has to return a tuple of data
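A small sketch of iterateWithTermination; the data and convergence criterion are made up — the iteration stops once the termination DataSet becomes empty:

```scala
import org.apache.flink.api.scala._

object IterateWithTerminationSketch {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val initial: DataSet[Double] = env.fromElements(1000.0, 500.0, 200.0)

    // halve the values each round; stop as soon as no value
    // is above the threshold anymore (empty termination set)
    val result = initial.iterateWithTermination(100) { current =>
      val next = current.map(_ / 2.0)
      val termination = next.filter(_ > 1.0)
      (next, termination)
    }

    result.print()
  }
}
```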

Re: UI for flink

2015-07-13 Thread Till Rohrmann
Hi Hermann, when you start a Flink cluster, then also the web interface is started. It is reachable under http://jobManagerURL:8081. The web interface tells you a lot about the current state of your cluster and the currently executed Flink jobs. Additionally, you can start the web client via

Re: flink on yarn configuration

2015-07-14 Thread Till Rohrmann
Hi Paul, when you run your Flink cluster with YARN then we cannot give the full amount of the allocated container memory to Flink. The reason is that YARN itself needs some of the memory as well. Since YARN is quite strict with containers which exceed their memory limit (the container is

Re: Submitting jobs from within Scala code

2015-07-16 Thread Till Rohrmann
to note, that both Flink and our project is built with Scala 2.11. Best Regards, Philipp On 16.07.2015 11:12, Till Rohrmann wrote: Hi Philipp, could you post the complete log output. This might help to get to the bottom of the problem. Cheers, Till On Thu, Jul 16, 2015 at 11:01 AM

Re: Submitting jobs from within Scala code

2015-07-16 Thread Till Rohrmann
Hi Philipp, could you post the complete log output? This might help to get to the bottom of the problem. Cheers, Till On Thu, Jul 16, 2015 at 11:01 AM, Philipp Goetze philipp.goe...@tu-ilmenau.de wrote: Hi community, in our project we try to submit built Flink programs to the jobmanager

Re: Submitting jobs from within Scala code

2015-07-16 Thread Till Rohrmann
On 16.07.2015 11:45, Till Rohrmann wrote: When you run your program from the IDE, then you can specify a log4j.properties file. There you can configure where and what to log. It should be enough to place the log4j.properties file in the resource folder of your project. An example properties file

Re: Re: Submitting jobs from within Scala code

2015-07-16 Thread Till Rohrmann
Hi Philipp, what I usually do to run a Flink program on a cluster from within my IDE, I create a RemoteExecutionEnvironment. Since I have one UDF (the map function which doubles the values) defined, I also need to specify the jar containing this class. In my case, the jar is called
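A sketch of the RemoteExecutionEnvironment setup Till describes (host, port, and jar path are placeholders; the jar must contain all UDF classes used in the program):

```scala
import org.apache.flink.api.scala._

object RemoteJob {
  def main(args: Array[String]): Unit = {
    // placeholder connection details and jar path
    val env = ExecutionEnvironment.createRemoteEnvironment(
      "jobmanager-host", 6123, "/path/to/job.jar")

    env.fromElements(1, 2, 3)
      .map(_ * 2) // this UDF's class is shipped via the jar above
      .print()
  }
}
```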

Re: Submitting jobs from within Scala code

2015-07-16 Thread Till Rohrmann
-JAR from the locally running Flink version. So to say I used an older Snapshot version for compiling than for running :-[ Best Regards, Philipp On 16.07.2015 17:35, Till Rohrmann wrote: Hi Philipp, what I usually do to run a Flink program on a cluster from within my IDE, I create

Re: Flink Kafka example in Scala

2015-07-17 Thread Till Rohrmann
These two links [1, 2] might help to get your job running. The first link describes how to set up a job using Flink's machine learning library, but it works also for the flink-connector-kafka library. Cheers, Till [1] http://stackoverflow.com/a/31455068/4815083 [2]

Re: Submitting jobs from within Scala code

2015-07-17 Thread Till Rohrmann
a version number into the messages (at least between client and JobManager) and fail fast on version mismatches... On Thu, Jul 16, 2015 at 5:56 PM, Till Rohrmann trohrm...@apache.org wrote: Good to hear that your problem is solved :-) Cheers, Till On Thu, Jul 16, 2015 at 5:45 PM, Philipp

Re: Flink deadLetters

2015-07-17 Thread Till Rohrmann
That is usually nothing to worry about. This just means that the message was sent without specifying a sender. What Akka then does is to use the `/deadLetters` actor as the sender. What kind of job is it? Cheers, Till On Fri, Jul 17, 2015 at 6:30 PM, Flavio Pompermaier pomperma...@okkam.it

Re: Flink Kafka example in Scala

2015-07-20 Thread Till Rohrmann
Hi Wendong, why do you exclude the kafka dependency from the `flink-connector-kafka`? Do you want to use your own kafka version? I'd recommend you to build a fat jar instead of trying to put the right dependencies in `/lib`. Here [1] you can see how to build a fat jar with sbt. Cheers, Till

Re: Too few memory segments provided exception

2015-07-20 Thread Till Rohrmann
The taskmanager.memory.fraction you can also set from within the IDE by giving the corresponding configuration object to the LocalEnvironment using the setConfiguration method. However, the taskmanager.heap.mb is basically the -Xmx value with which you start your JVM. Usually, you can set this in
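A sketch of setting the memory fraction for a local environment from the IDE, assuming a createLocalEnvironment overload that accepts a Configuration (if your version's Scala API lacks it, the Java ExecutionEnvironment provides one); the fraction value is an arbitrary example:

```scala
import org.apache.flink.api.scala._
import org.apache.flink.configuration.Configuration

object LocalMemoryConfig {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration()
    // fraction of the heap given to Flink's managed memory (example value)
    conf.setFloat("taskmanager.memory.fraction", 0.5f)

    val env = ExecutionEnvironment.createLocalEnvironment(conf)
    env.fromElements(1, 2, 3).print()
  }
}
```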

Re: sorted cogroup

2015-07-21 Thread Till Rohrmann
Hi Michele, Flink supports coGroups on sorted inputs. If you have a ds1 = DataSet[(Key, Value1)] and ds2 = DataSet[(Key, Value2)] you obtain a sorted coGroup for example by: ds1.coGroup(ds2).where(0).equalTo(0).sortFirstGroup(1, Order.ASCENDING).sortSecondGroup(1, Order.DESCENDING) Cheers,
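Expanded into a compilable sketch (the example data and the concatenation in the coGroup function are invented for illustration):

```scala
import org.apache.flink.api.common.operators.Order
import org.apache.flink.api.scala._

object SortedCoGroup {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    val ds1 = env.fromElements((1, "b"), (1, "a"), (2, "c"))
    val ds2 = env.fromElements((1, 9L), (1, 7L), (2, 8L))

    // groups arrive sorted: first input ascending, second descending
    val result = ds1.coGroup(ds2)
      .where(0).equalTo(0)
      .sortFirstGroup(1, Order.ASCENDING)
      .sortSecondGroup(1, Order.DESCENDING)
      .apply { (left, right) =>
        (left.map(_._2).mkString(","), right.map(_._2).mkString(","))
      }

    result.print()
  }
}
```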

Re: Flink Kafka example in Scala

2015-07-21 Thread Till Rohrmann
Glad to hear that it finally worked :-) On Tue, Jul 21, 2015 at 2:21 AM, Wendong wendong@gmail.com wrote: Hi Till, Thanks for your suggestion! I did a fat jar and the runtime error of ClassNotFoundException was finally gone. I wish I had tried fat jar earlier and it would have saved me

Re: Flink Data Stream Union

2015-10-21 Thread Till Rohrmann
Can it be that you forgot to call unionMessageStreams in your main method? Cheers, Till ​ On Wed, Oct 21, 2015 at 3:02 PM, flinkuser wrote: > Here is the strange behavior. > > Below code works in one box but not in the other. I had it working in my > laptop the whole of

Re: Zeppelin Integration

2015-10-21 Thread Till Rohrmann
Hi Trevor, in order to use Zeppelin with a different Flink version in local mode, meaning that Zeppelin starts a LocalFlinkMiniCluster when executing your jobs, you have to build Zeppelin and change the flink.version property in the zeppelin/flink/pom.xml file to the version you want to use. If

Re: Reading multiple datasets with one read operation

2015-10-22 Thread Till Rohrmann
I fear that the filter operations are not chained because there are at least two of them which have the same DataSet as input. However, it's true that the intermediate results are not materialized. It is also correct that the filter operators are deployed colocated to the data sources. Thus,

Re: Multiple keys in reduceGroup ?

2015-10-22 Thread Till Rohrmann
If not, could you provide us with the program and test data to reproduce the error? Cheers, Till On Thu, Oct 22, 2015 at 12:34 PM, Aljoscha Krettek wrote: > Hi, > but he’s comparing it to a primitive long, so shouldn’t the Long key be > unboxed and the comparison still be

Re: Multiple keys in reduceGroup ?

2015-10-22 Thread Till Rohrmann
nce it gets the right data. > > Since I never modify my objects, why isn't object reuse working? > > Best regards, > > Arnaud > > *From:* Till Rohrmann [mailto:trohrm...@apache.org] > *Sent:* Thursday, 22 October 2015 12:36 > *To:*

Re: Zeppelin Integration

2015-11-04 Thread Till Rohrmann
pache-casserole-a-delicious-big-data-recipe-for-the-whole-family/ > Thanks Trevor for the great tutorial! > > On Thu, Oct 22, 2015 at 4:23 PM, Till Rohrmann <trohrm...@apache.org> > wrote: > >> Hi Trevor, >> >> that’s actually my bad since I only tested my b

Re: Published test artifacts for flink streaming

2015-11-06 Thread Till Rohrmann
ess Table API's > select and where operators from within such a flatMap? > > -n > > On Fri, Nov 6, 2015 at 6:19 AM, Till Rohrmann <trohrm...@apache.org> > wrote: > > Hi Nick, > > > > I think a flatMap operation which is instantiated with your list of >

Re: Published test artifacts for flink streaming

2015-11-06 Thread Till Rohrmann
Hi Nick, I think a flatMap operation which is instantiated with your list of predicates should do the job. Thus, there shouldn’t be a need to dig deeper than the DataStream for the first version. Cheers, Till ​ On Fri, Nov 6, 2015 at 3:58 AM, Nick Dimiduk wrote: > Thanks

Re: data sink stops method

2015-10-15 Thread Till Rohrmann
Could you post a minimal example of your code where the problem is reproducible? I assume that there has to be another problem because env.execute should actually trigger the execution. Cheers, Till ​ On Thu, Oct 8, 2015 at 8:58 PM, Florian Heyl wrote: > Hey Stephan and Pieter,

Re: reduce error

2015-10-20 Thread Till Rohrmann
Hi Michele, I will look into the problem. As Ufuk said, it would be really helpful, if you could provide us with the data set. If it's problematic to share the data via the mailing list, then you could also send me the data privately. Thanks a lot for your help. Cheers, Till On Fri, Oct 16,

Re: kryo exception due to race condition

2015-10-06 Thread Till Rohrmann
Hi Stefano, we'll definitely look into it once Flink Forward is over and we've finished the current release work. Thanks for reporting the issue. Cheers, Till On Tue, Oct 6, 2015 at 9:21 AM, Stefano Bortoli wrote: > Hi guys, I could manage to complete the process crossing

Re: YARN High Availability

2015-11-18 Thread Till Rohrmann
Hi Gwenhaël, do you have access to the yarn logs? Cheers, Till On Wed, Nov 18, 2015 at 5:55 PM, Gwenhael Pasquiers < gwenhael.pasqui...@ericsson.com> wrote: > Hello, > > > > We’re trying to set up high availability using an existing zookeeper > quorum already running in our Cloudera cluster. >

Re: Multiple restarts of Local Cluster

2015-09-02 Thread Till Rohrmann
xecution, while in local mode. > > -- Sachin Goel > Computer Science, IIT Delhi > m. +91-9871457685 > > On Wed, Sep 2, 2015 at 9:19 PM, Till Rohrmann <trohrm...@apache.org> > wrote: > >> Why is it not possible to shut down the local cluster? Can’t you shut it >&

Re: Multiple restarts of Local Cluster

2015-09-02 Thread Till Rohrmann
Why is it not possible to shut down the local cluster? Can’t you shut it down in the @AfterClass method? ​ On Wed, Sep 2, 2015 at 4:56 PM, Sachin Goel wrote: > Yes. That will work too. However, then it isn't possible to shut down the > local cluster. [Is it necessary

Re: Multiple restarts of Local Cluster

2015-09-02 Thread Till Rohrmann
. It might matter for the test execution where maven reuses the JVMs and where the LocalFlinkMiniCluster won’t be garbage collected right away. You could try it out and see what happens. Cheers, Till ​ On Wed, Sep 2, 2015 at 6:03 PM, Till Rohrmann <trohrm...@apache.org> wrote: > Oh sorr

Re: Multiple restarts of Local Cluster

2015-09-02 Thread Till Rohrmann
mTestBase or > JavaProgramTestBase​. Those shut down the cluster explicitly anyway. > I will make sure if this is the case. > > Regards > Sachin > > -- Sachin Goel > Computer Science, IIT Delhi > m. +91-9871457685 > > On Wed, Sep 2, 2015 at 9:40 PM, Till Rohrmann <trohrm...@

Re: Broadcasting sets in Flink Streaming

2015-08-25 Thread Till Rohrmann
Hi Tamara, I think this is not officially supported by Flink yet. However, I think that Gyula had once an example where he did something comparable. Maybe he can chime in here. Cheers, Till On Tue, Aug 25, 2015 at 11:15 AM, Tamara Mendt tammyme...@gmail.com wrote: Hello, I have been trying

Re: Flink HA mode

2015-09-09 Thread Till Rohrmann
The only necessary information for the JobManager and TaskManager is to know where to find the ZooKeeper quorum to do leader election and retrieve the leader address from. This will be configured via the config parameter `ha.zookeeper.quorum`. On Wed, Sep 9, 2015 at 10:15 AM, Stephan Ewen

Re: Issue with parallelism when CoGrouping custom nested data tpye

2015-09-16 Thread Till Rohrmann
Hi Pieter, your code doesn't look suspicious at the first glance. Would it be possible for you to post a complete example with data (also possible to include it in the code) to reproduce your problem? Cheers, Till On Wed, Sep 16, 2015 at 10:31 AM, Pieter Hameete wrote: >

Re: data flow example on cluster

2015-09-30 Thread Till Rohrmann
It's described here: https://ci.apache.org/projects/flink/flink-docs-release-0.9/quickstart/setup_quickstart.html#run-example Cheers, Till On Wed, Sep 30, 2015 at 8:24 AM, Lydia Ickler wrote: > Hi all, > > I want to run the data-flow Wordcount example on a Flink

Re: Flink program compiled with Janino fails

2015-10-05 Thread Till Rohrmann
I’m not a Janino expert but it might be related to the fact that Janino does not fully support generic types (see http://unkrig.de/w/Janino under limitations). Maybe it works if you use the untyped MapFunction type. Cheers, Till ​ On Sat, Oct 3, 2015 at 8:04 PM, Giacomo Licari

Re: data flow example on cluster

2015-10-02 Thread Till Rohrmann
Hi Lydia, I think the APIs of the versions 0.8-incubating-SNAPSHOT and 0.10-SNAPSHOT are not compatible. Thus, it’s not just simply setting the dependencies to 0.10-SNAPSHOT. You also have to fix the API changes. This might not be trivial. Therefore, I’d recommend you to simply use the ALS

Re: Using Kryo for serializing lambda closures

2015-12-08 Thread Till Rohrmann
Hi Nick, at the moment Flink uses Java serialization to ship the UDFs to the cluster. Therefore, the closures must only contain Serializable objects. The serializer registration only applies to the data which is processed by the Flink job. Thus, for the moment I would try to get rid of the

Re: Including option for starting job and task managers in the foreground

2015-12-02 Thread Till Rohrmann
Hi Brian, as far as I know this is at the moment not possible with our scripts. However it should be relatively easy to add by simply executing the Java command in flink-daemon.sh in the foreground. Do you want to add this? Cheers, Till On Dec 1, 2015 9:40 PM, "Brian Chhun"

Re: Question about flink message processing guarantee

2015-12-02 Thread Till Rohrmann
Just a small addition. Your sources have to be replayable to some extent. With replayable I mean that they can continue from some kind of offset. Otherwise the checkpointing won't help you. The Kafka source supports that for example. Cheers, Till On Dec 1, 2015 11:55 PM, "Márton Balassi"

Re: HA Mode and standalone containers compatibility ?

2015-12-03 Thread Till Rohrmann
Hi Arnaud, as long as you don't have HA activated for your batch jobs, HA shouldn't have an influence on the batch execution. If it interferes, then you should see additional task managers connected to the streaming cluster when you execute the batch job. Could you check that? Furthermore, could

Re: Using memory logging in Flink

2015-12-09 Thread Till Rohrmann
I assume you're looking in the taskmanager log file for the memory usage logging statements, right? Cheers, Till On Wed, Dec 9, 2015 at 12:15 AM, Filip Łęczycki wrote: > Hi, > > Thank you for your reply! > > I have made sure I restarted the TaskManager after changing

Re: Problems with using ZipWithIndex

2015-12-14 Thread Till Rohrmann
I just tested the zipWithIndex method with Flink 0.10.1 and it worked. I used the following code: import org.apache.flink.api.scala._ import org.apache.flink.api.scala.utils._ object Job { def main(args: Array[String]): Unit = { val env = ExecutionEnvironment.getExecutionEnvironment
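The test Till describes, completed into a minimal runnable sketch (the example data is invented):

```scala
import org.apache.flink.api.scala._
import org.apache.flink.api.scala.utils._

object ZipWithIndexExample {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val data = env.fromElements("a", "b", "c")

    // assigns a unique, consecutive Long index to every element
    val indexed: DataSet[(Long, String)] = data.zipWithIndex

    indexed.print()
  }
}
```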

Re: Problem to show logs in task managers

2015-12-17 Thread Till Rohrmann
Hi Ana, you can simply modify the `log4j.properties` file in the `conf` directory. It should be automatically included in the Yarn application. Concerning your logging problem, it might be that you have set the logging level too high. Could you share the code with us? Cheers, Till On Thu, Dec

Re: global watermark across multiple kafka consumers

2015-12-16 Thread Till Rohrmann
Hi Andrew, as far as I know, there is nothing such as a prescribed way of handling this kind of situation. If you want to synchronize the watermark generation given a set of KafkaConsumers you need some kind of ground truth. This could be, for example, a central registry such as ZooKeeper in

Re: Scala API and sources with timestamp

2016-01-04 Thread Till Rohrmann
Hi Don, yes that's exactly how you use an anonymous function as a source function. Cheers, Till On Tue, Dec 22, 2015 at 3:16 PM, Don Frascuchon wrote: > Hello, > > There is a way for define a EventTimeSourceFunction with anonymous > functions from the scala api? Like

Re: Problem to show logs in task managers

2016-01-04 Thread Till Rohrmann
on-enable property well or I am not restarting the Flink > JobManager and TaskManagers as I should… Any idea? > > Thanks, > Ana > > On 18 Dec 2015, at 16:29, Till Rohrmann <trohrm...@apache.org> wrote: > > In which log file are you exactly looking for the logging statements?

Re: kafka integration issue

2016-01-05 Thread Till Rohrmann
Hi Alex, this is a bug in the `0.10` release. Is it possible for you to switch to version `1.0-SNAPSHOT`. With this version, the error should no longer occur. Cheers, Till On Tue, Jan 5, 2016 at 1:31 AM, Alex Rovner wrote: > Hello Flinkers! > > The below program

Re: flink kafka scala error

2016-01-06 Thread Till Rohrmann
Hi Madhukar, could you check whether your Flink installation contains the flink-dist-0.10.1.jar in the lib folder? This file contains the necessary scala-library.jar which you are missing. You can also remove the line org.scala-lang:scala-library which excludes the scala-library dependency to be

Re: Problem to show logs in task managers

2015-12-18 Thread Till Rohrmann
<String, Integer>(word, 1)); > } > } > } > > public static class TestClass implements Serializable { > private static final long serialVersionUID = -2932037991574118651L; > > static Logger loggerTestClass = > LoggerFactory.g

Re: Configure log4j with XML files

2015-12-21 Thread Till Rohrmann
Hi Gwenhaël, as far as I know, there is no direct way to do so. You can either adapt the flink-daemon.sh script in line 68 to use a different configuration or you can test whether the dynamic property -Dlog4j.configurationFile:CONFIG_FILE overrides the -Dlog4j.configuration property. You can set
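One way to try the dynamic-property route is to pass the alternative log4j config through the JVM options picked up by Flink's start scripts; the exact variable and path below are illustrative and may differ between versions:

```sh
# illustrative: hand an XML log4j config to the daemons via JVM options
# (check your version's config.sh / flink-daemon.sh for the variable it reads)
export JVM_ARGS="$JVM_ARGS -Dlog4j.configurationFile=file:///path/to/log4j.xml"
```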

Re: Standalone Cluster vs YARN

2015-11-25 Thread Till Rohrmann
Hi Welly, at the moment Flink only supports HA via ZooKeeper. However, there is no limitation to use another system. The only requirement is that this system allows you to find a consensus among multiple participants and to retrieve the agreed-upon decision. If this is possible, then it can be

Re: key

2015-11-30 Thread Till Rohrmann
Hi Radu, if you want to use custom types as keys, then these custom types have to implement the Key interface. Cheers, Till ​ On Mon, Nov 30, 2015 at 5:28 PM, Radu Tudoran wrote: > Hi, > > > > I want to apply a “keyBy operator on a stream”. > > The string is of type

Re: Working with protobuf wrappers

2015-12-01 Thread Till Rohrmann
Hi Kryzsztof, it's true that we once added the Protobuf serializer automatically. However, due to versioning conflicts (see https://issues.apache.org/jira/browse/FLINK-1635), we removed it again. Now you have to register the ProtobufSerializer manually:

Re: YARN High Availability

2015-11-18 Thread Till Rohrmann
Another question : if I have multiple HA flink jobs, are there some points > to check in order to be sure that they won’t collide on hdfs or ZK ? > > > > B.R. > > > > Gwenhaël PASQUIERS > > > > *From:* Till Rohrmann [mailto:till.rohrm...@gmail.com] > *Sent:* merc

Re: Compiler Exception

2015-11-19 Thread Till Rohrmann
Hi Kien Truong, could you share the problematic code with us? Cheers, Till On Nov 18, 2015 9:54 PM, "Truong Duc Kien" wrote: > Hi, > > I'm hitting Compiler Exception with some of my data set, but not all of > them. > > Exception in thread "main"

Re: Compiler Exception

2015-11-20 Thread Till Rohrmann
d: Iterator[Edge[Long, NullValue]], >collector: Collector[Vertex[Long, Long]]) => { > if (first.hasNext) { > collector.collect(first.next) > } > } > } > } > } > println(data.collect()) > } &

Re: YARN High Availability

2015-11-19 Thread Till Rohrmann
paths unique. > > On Thu, Nov 19, 2015, 11:24 Till Rohrmann <trohrm...@apache.org> wrote: > >> I agree that this would make the configuration easier. However, it >> entails also that the user has to retrieve the randomized path from the >> logs if he wants
