Re: Maintaining data locality with list of paths (strings) as input

2015-03-15 Thread Robert Metzger
Hi, @Emmanuel: Is the Flink behavior mentioned native or is this something happening when running Flink on YARN? The input split assignment behavior Stephan described is implemented in Flink itself, so it works in a standalone Flink cluster and in a YARN setup. In a setup where each machine running a

Re: Flink cluster dev environment in Docker

2015-03-17 Thread Robert Metzger
Hey Emmanuel, thank you for this great contribution. I'm going to test the Docker deployment soon. I would actually like to include the files in the Flink source repository, either in the flink-contrib module or in the tools directory. What's the take of the other committers on this? On

Re: GSoC project proposal: Query optimisation layer for Flink Streaming

2015-03-24 Thread Robert Metzger
Just a quick ping on this for the streaming folks: The deadline for the proposal submissions is Friday, so the GSoC applicants need to get our feedback asap. The student asked me today in the #flink channel whether we can review this proposal. I have the following comments regarding the

Re: Flink cluster dev environment in Docker

2015-03-24 Thread Robert Metzger
would like to keep the files in your repository, we can also link to it. On Fri, Mar 20, 2015 at 6:05 AM, Henry Saputra henry.sapu...@gmail.com wrote: +1 for the idea. I cross post this to dev@ list for FYI - Henry On Tue, Mar 17, 2015 at 2:54 AM, Robert Metzger rmetz...@apache.org wrote

Re: Gelly available already?

2015-03-23 Thread Robert Metzger
Hi, Gelly is not part of any official Flink release. You have to use a snapshot version of Flink if you want to try it out. Sent from my iPhone On 23.03.2015, at 23:10, Andra Lungu lungu.an...@gmail.com wrote: Hi Sebastian, For me it works just as described there, with 0.9, but there

Re: CoGroup Operator Data Sink

2015-04-14 Thread Robert Metzger
Hi, you can write the output of a coGroup operator to two sinks: (CoGroup) --> Sink1 and (CoGroup) --> Sink2. You can actually write to as many sinks as you want. Note that the data written to Sink1 and Sink2 will be
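
A minimal sketch of this fan-out in the DataSet API; the two input data sets, the CoGroup logic, and the sink paths are placeholders:

    import org.apache.flink.api.common.functions.CoGroupFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.util.Collector;

    public class CoGroupTwoSinks {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            DataSet<Tuple2<Integer, String>> left  = env.fromElements(new Tuple2<Integer, String>(1, "a"));
            DataSet<Tuple2<Integer, String>> right = env.fromElements(new Tuple2<Integer, String>(1, "b"));

            DataSet<String> coGrouped = left.coGroup(right)
                .where(0).equalTo(0)
                .with(new CoGroupFunction<Tuple2<Integer, String>, Tuple2<Integer, String>, String>() {
                    @Override
                    public void coGroup(Iterable<Tuple2<Integer, String>> l,
                                        Iterable<Tuple2<Integer, String>> r,
                                        Collector<String> out) {
                        // placeholder logic: emit the group sizes per key
                        int leftSize = 0;
                        for (Tuple2<Integer, String> ignored : l) { leftSize++; }
                        int rightSize = 0;
                        for (Tuple2<Integer, String> ignored : r) { rightSize++; }
                        out.collect(leftSize + " left / " + rightSize + " right");
                    }
                });

            // the same CoGroup result feeds two (or more) sinks
            coGrouped.writeAsText("file:///tmp/sink1");
            coGrouped.writeAsText("file:///tmp/sink2");
            env.execute("coGroup with two sinks");
        }
    }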

Re: flink ml k-means release

2015-05-11 Thread Robert Metzger
Hi, the community hasn't decided on a plan for releasing Flink 0.9 yet. Here you can track the progress of the Flink ML variant of KMeans: https://issues.apache.org/jira/browse/FLINK-1731 There is also a KMeans implementation in the examples of Flink. Maybe that is sufficient for now? --Robert

Re: Error when using two data sinks

2015-05-10 Thread Robert Metzger
Hi, which version of Flink are you using? Since 0.8.0, Flink supports disjoint dataflows ( https://issues.apache.org/jira/browse/FLINK-820). On Sun, May 10, 2015 at 6:11 PM, hagersaleh loveallah1...@yahoo.com wrote: Exception in thread main org.apache.flink.compiler.CompilerException: The

Re: Error when using two data sinks

2015-05-10 Thread Robert Metzger
Can you upgrade to Flink 0.8.0? On Sun, May 10, 2015 at 8:47 PM, hagersaleh loveallah1...@yahoo.com wrote: use flink 7.0 -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Error-when-use-tow-datasink-tp1205p1209.html Sent from the

Re: want write Specified field from dataset to output file

2015-05-12 Thread Robert Metzger
Hi, please have a look at the programming guide on how to write results to an output file: http://ci.apache.org/projects/flink/flink-docs-master/apis/programming_guide.html#data-sinks In your case, I would recommend: customers.writeAsFormattedText(file:///path/to/the/result/file, new
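
For instance, a sketch of that call with a hypothetical Customer POJO and a placeholder output path; TextFormatter here is the formatter type nested in TextOutputFormat:

    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.io.TextOutputFormat.TextFormatter;

    public class WriteFormattedText {
        // hypothetical POJO standing in for the customers data set
        public static class Customer {
            public String name;
            public String city;
        }

        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            Customer c = new Customer();
            c.name = "Alice";
            c.city = "Berlin";
            DataSet<Customer> customers = env.fromElements(c);

            customers.writeAsFormattedText("file:///path/to/the/result/file",
                new TextFormatter<Customer>() {
                    @Override
                    public String format(Customer value) {
                        // emit only the fields that should end up in the file
                        return value.name + "," + value.city;
                    }
                });
            env.execute("write selected fields");
        }
    }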

Re: KryoException with joda Datetime

2015-05-15 Thread Robert Metzger
Hi Flavio, which version of Flink are you using? If you are using 0.9 something, then this should actually work ;) On Fri, May 15, 2015 at 10:06 AM, Flavio Pompermaier pomperma...@okkam.it wrote: Hi to all, this morning I run my Flink job and I got the following exception serializing a

Re: KryoException with joda Datetime

2015-05-15 Thread Robert Metzger
with Flink 0.9-SNAPSHOT (i.e. 2.5) and before today I've never seen this error... also because DateTime is Serializable :) On Fri, May 15, 2015 at 10:25 AM, Fabian Hueske fhue...@gmail.com wrote: Is there a chance that the version of JodaTime changed? 2015-05-15 10:22 GMT+02:00 Robert Metzger

Re: KryoException with joda Datetime

2015-05-15 Thread Robert Metzger
changed..) On Fri, May 15, 2015 at 10:09 AM, Robert Metzger rmetz...@apache.org wrote: Hi Flavio, which version of Flink are you using? If you are using 0.9 something, then this should actually work ;) On Fri, May 15, 2015 at 10:06 AM, Flavio Pompermaier pomperma...@okkam.it wrote: Hi

Re: KryoException with joda Datetime

2015-05-15 Thread Robert Metzger
...@okkam.it wrote: So do you think you could release a patch soon? I need it to continue my work..otherwise if it's very simple you could send me the snippet of code to change my local flink version ;) Best, Flavio On Fri, May 15, 2015 at 11:22 AM, Robert Metzger rmetz...@apache.org wrote

Re: KryoException with joda Datetime

2015-05-15 Thread Robert Metzger
by distinct?? On Fri, May 15, 2015 at 2:38 PM, Robert Metzger rmetz...@apache.org wrote: Hi, the error means that you are grouping on a field which contains null values. We cannot compare elements against null, that's why we throw the exception. Are you sure that you're not having any

Re: KryoException with joda Datetime

2015-05-15 Thread Robert Metzger
;) It's simply not supported, sorry. On Fri, May 15, 2015 at 2:57 PM, Flavio Pompermaier pomperma...@okkam.it wrote: Is it a bug or is it a feature? :) On Fri, May 15, 2015 at 2:56 PM, Robert Metzger rmetz...@apache.org wrote: I think distinct() is failing on null values because it's using

Re: Logging in Flink 0.9.0-milestone-1

2015-04-14 Thread Robert Metzger
configuration for my own logging is working (even without the parameter), it's just that the file in the jar seems to be preferred over my file. Best, Stefan On 14 April 2015 at 17:16, Robert Metzger rmetz...@apache.org wrote: Hi Stefan, we made a stupid mistake in the 0.9.0-milestone-1 release

Re: Logging in Flink 0.9.0-milestone-1

2015-04-14 Thread Robert Metzger
Hi Stefan, we made a stupid mistake in the 0.9.0-milestone-1 release by including our log4j.properties in the flink-runtime jar. It's also in the fat jar in flink-dist. Maybe you can pass the name of your log4j file to your application with -Dlog4j.configuration=log4j.xml? The issue is

Re: Orphaned chunks

2015-04-15 Thread Robert Metzger
Hey Flavio, I was not able to find the string "Orphaned chunk" in the Flink code base. However, I found it here: https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/memory/GlobalMemoryManager.java#L157 Maybe you've sent the message to the wrong mailing list?

Re: Concurrency on fields within operators

2015-04-16 Thread Robert Metzger
Hi Flavio, tl;dr: they are thread safe. Making a field transient means that it is ignored when serializing the class. This is for example useful when you have a non-serializable field in your function (you then have to initialize it in the open() method). So making it transient or not doesn't
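
A small sketch of that pattern: a non-serializable helper (here a MessageDigest, chosen for illustration) is kept in a transient field and created in open() on the worker instead of being shipped with the function:

    import java.security.MessageDigest;
    import javax.xml.bind.DatatypeConverter;

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;

    public class HashingMapper extends RichMapFunction<String, String> {

        // MessageDigest is not serializable, so the field is transient
        // and the instance is created per parallel task in open()
        private transient MessageDigest digest;

        @Override
        public void open(Configuration parameters) throws Exception {
            digest = MessageDigest.getInstance("SHA-256");
        }

        @Override
        public String map(String value) {
            byte[] hash = digest.digest(value.getBytes());
            return DatatypeConverter.printHexBinary(hash);
        }
    }

Each parallel instance of the function gets its own copy of the field, which is why such fields are effectively thread safe.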

Re: Adding log4j.properties file into src project

2015-04-17 Thread Robert Metzger
Hi, the log4j.properties file looks nice. The issue is that the resources folder is not marked as a source/resource folder, that's why Eclipse is not adding the file to the classpath. Have a look here:

Re: Exception with Java Table API example

2015-04-13 Thread Robert Metzger
Hi, the error looks like a mixup of the Scala versions. Are you adding any scala 2.11 dependencies in your pom? On Sun, Apr 12, 2015 at 9:35 PM, Mohamed Nadjib MAMI m...@iai.uni-bonn.de wrote: Hello, Suit to my previous email, apologize, the error was in: *Table table =

Re: Release date 0.9

2015-04-13 Thread Robert Metzger
Hi Sebastian, it's great to hear that you're planning to do a seminar on data analytics platforms. We have closed the vote for a 0.9.0-milestone-1 release yesterday. I hope we are able to announce it today. It's a preview release only, but it contains all the latest features and it's certainly

Re: how can Handling in flink this operation in sql bettween ,like , In

2015-04-12 Thread Robert Metzger
Hi, all these operations can be implemented in Flink using the filter() function. For example BETWEEN: DataSet<Product> products = env.createFromElements(...); products.filter(new FilterFunction<Product>() { @Override public boolean filter(Product value) throws Exception { return
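
A fuller, self-contained version of such a filter, with a hypothetical Product POJO (name and price fields) and the bounds 10 and 100 chosen for illustration:

    import org.apache.flink.api.common.functions.FilterFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;

    public class BetweenFilter {
        // hypothetical POJO
        public static class Product {
            public String name;
            public double price;
        }

        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            Product pen = new Product();
            pen.name = "pen";
            pen.price = 2.0;
            Product book = new Product();
            book.name = "book";
            book.price = 25.0;
            DataSet<Product> products = env.fromElements(pen, book);

            // SQL: WHERE price BETWEEN 10 AND 100
            DataSet<Product> between = products.filter(new FilterFunction<Product>() {
                @Override
                public boolean filter(Product value) throws Exception {
                    return value.price >= 10 && value.price <= 100;
                }
            });

            between.writeAsText("file:///tmp/between");
            env.execute("BETWEEN via filter()");
        }
    }

LIKE and IN work the same way: the filter() body just uses String.matches()/contains() or a membership check instead of the range comparison.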

Re: How can generation dataset in flink automatic depend on number of filed and data type

2015-04-11 Thread Robert Metzger
Hey, do you want to read from a JDBC database? What exactly do you mean by automatic function? In general, I think you can read rows from a JDBC database and then use a map function to transform the rows from that database into a custom datatype (POJO). Best, Robert On Sat, Apr 11, 2015 at

Re: CSV header management

2015-04-17 Thread Robert Metzger
Yes, using a map() function that transforms the input lines into the String array. On Fri, Apr 17, 2015 at 9:03 PM, Flavio Pompermaier pomperma...@okkam.it wrote: My csv has 32 columns, is there a way to create a Dataset<String[32]>? On Apr 17, 2015 7:53 PM, Stephan Ewen se...@apache.org
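
For example, a sketch of such a mapper; the file path, the comma delimiter, and the sink are placeholders:

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;

    public class CsvToStringArray {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // read the CSV as plain lines and split each line into its 32 columns
            DataSet<String[]> rows = env.readTextFile("file:///path/to/data.csv")
                .map(new MapFunction<String, String[]>() {
                    @Override
                    public String[] map(String line) {
                        return line.split(",");
                    }
                });

            // placeholder sink so the sketch runs end to end
            rows.writeAsText("file:///tmp/rows");
            env.execute("csv lines to String[]");
        }
    }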

Re: Logging in Flink 0.9.0-milestone-1

2015-04-14 Thread Robert Metzger
You can control the logging behavior from the ExecutionConfig (env.getExecutionConfig()). There is a method (disableSysoutLogging()) that you can use. (In 0.9-SNAPSHOT only). Sorry again that you ran into this issue. On Tue, Apr 14, 2015 at 8:45 PM, Robert Metzger rmetz...@apache.org wrote: Ah

Re: Logging in Flink 0.9.0-milestone-1

2015-04-14 Thread Robert Metzger
) [1] https://gist.github.com/knub/1c11683601b4eeb5d51b On 14 April 2015 at 18:47, Robert Metzger rmetz...@apache.org wrote: Hi, how are you starting Flink? Out of the IDE? Using the scripts? I just created a new flink project with the milestone version. Just putting your log4j.xml

Re: Write Stream to HBase

2015-05-20 Thread Robert Metzger
expecting an OutputFormat instead of a FileOutputFormat. Maybe it's worth trying to see how far we can get with this method. On Wed, May 20, 2015 at 11:00 AM, Robert Metzger rmetz...@apache.org wrote: Hi, I agree with Hilmi, Flavio's examples are for batch. I'm not aware of a StreamingHBaseSink

Re: Write Stream to HBase

2015-05-20 Thread Robert Metzger
, Hilmi On 20.05.2015 at 11:15, Robert Metzger wrote: There is this pending pull request which is addressing exactly the issues I've mentioned (wrong naming, private method): https://github.com/apache/flink/pull/521 I'll see what's blocking the PR ... On Wed, May 20, 2015 at 11:11 AM

Re: when return value from linkedlist or map and use in filter function display error

2015-06-08 Thread Robert Metzger
What exactly is the error you are getting when using the non-static field? On Mon, Jun 8, 2015 at 2:41 PM, hagersaleh loveallah1...@yahoo.com wrote: when use non-static filed display error and filter function not show map -- View this message in context:

Re: start-scala-shell.sh

2015-06-06 Thread Robert Metzger
Hi Bill, the Scala Shell is a very recent contribution to our project. I have to admit that I didn't test it yet. But I'm also unable to find the script in the bin directory. There seems to be something wrong. I'll investigate the issue... On Sat, Jun 6, 2015 at 2:33 PM, Bill Sparks

Re: start-scala-shell.sh

2015-06-06 Thread Robert Metzger
, Robert Metzger rmetz...@apache.org wrote: Hi Bill, the Scala Shell is a very recent contribution to our project. I have to admit that I didn't test it yet. But I'm also unable to find the script in the bin directory. There seems to be something wrong. I'll investigate the issue... On Sat, Jun

Re: Reading from HBase problem

2015-06-08 Thread Robert Metzger
Hi Hilmi, if you just want to count the number of elements, you can also use accumulators, as described here [1]. They are much more lightweight. So you need to make your flatMap function a RichFlatMapFunction, then call getRuntimeContext(). Use a long accumulator to count the elements. If
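
A sketch of such a counting function, assuming the standard LongCounter accumulator registered in open() and incremented per element (the accumulator name "num-elements" is a placeholder):

    import org.apache.flink.api.common.accumulators.LongCounter;
    import org.apache.flink.api.common.functions.RichFlatMapFunction;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.util.Collector;

    public class CountingFlatMap extends RichFlatMapFunction<String, String> {

        private final LongCounter numElements = new LongCounter();

        @Override
        public void open(Configuration parameters) {
            // the JobManager merges this accumulator across all parallel instances
            getRuntimeContext().addAccumulator("num-elements", numElements);
        }

        @Override
        public void flatMap(String value, Collector<String> out) {
            numElements.add(1L);
            out.collect(value);
        }
    }

After env.execute(), the merged count can be read from the returned JobExecutionResult via getAccumulatorResult("num-elements").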

Re: count the k-means iteration

2015-06-03 Thread Robert Metzger
Sorry for the late reply. Operators that are part of an iteration allow you to get the current iteration id like this: getIterationRuntimeContext().getSuperstepNumber() To get the IterationRuntimeContext, the function needs to implement the Rich* interface. On Tue, May 26, 2015 at 4:01 PM, Pa Rö
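
For illustration, a sketch that logs the superstep in open(); the Tuple2 element type and the no-op map body are placeholders, and getIterationRuntimeContext() only works for operators running inside an iteration:

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.configuration.Configuration;

    public class SuperstepLoggingMapper
            extends RichMapFunction<Tuple2<Double, Double>, Tuple2<Double, Double>> {

        @Override
        public void open(Configuration parameters) {
            int superstep = getIterationRuntimeContext().getSuperstepNumber();
            System.out.println("Iteration " + superstep
                    + " started at " + System.currentTimeMillis());
        }

        @Override
        public Tuple2<Double, Double> map(Tuple2<Double, Double> point) {
            return point; // the real k-means assignment/update logic goes here
        }
    }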

Re: when return value from linkedlist or map and use in filter function display error

2015-06-09 Thread Robert Metzger
Great! I'm happy to hear that it worked. On Tue, Jun 9, 2015 at 5:28 PM, hagersaleh loveallah1...@yahoo.com wrote: I can solve problem when final Map<String, Integer> map = new HashMap<String, Integer>(); very thanks code run in command line not any error public static void

Re: flink k-means on hadoop cluster

2015-06-04 Thread Robert Metzger
...@googlemail.com wrote: [cloudera@quickstart bin]$ sudo su yarn bash-4.1$ hadoop fs -chmod 777 -chmod: Not enough arguments: expected 2 but got 1 Usage: hadoop fs [generic options] -chmod [-R] MODE[,MODE]... | OCTALMODE PATH... bash-4.1$ you understand? 2015-06-04 17:04 GMT+02:00 Robert Metzger

Re: flink k-means on hadoop cluster

2015-06-04 Thread Robert Metzger
:146) ... 23 more how i must set the files in the hdfs? quickstart.cloudera:50075/home/cloudera/output? 2015-06-04 16:51 GMT+02:00 Robert Metzger rmetz...@apache.org: Once you've started the YARN session, you can submit a Flink job with ./bin/flink run pathToYourJar. The jar file

Re: flink k-means on hadoop cluster

2015-06-04 Thread Robert Metzger
...@googlemail.com wrote: okay, now it run on my hadoop. how i can start my flink job? and where must the jar file save, at hdfs or as local file? 2015-06-04 16:31 GMT+02:00 Robert Metzger rmetz...@apache.org: Yes, you have to run these commands in the command line of the Cloudera VM. On Thu, Jun 4

Re: flink k-means on hadoop cluster

2015-06-04 Thread Robert Metzger
...@googlemail.com wrote: thanks, now i want run my app on cloudera live vm single node, how i can define my flink job with hue? i try to run the flink script in the hdfs, it's not work. best regards, paul 2015-06-02 14:50 GMT+02:00 Robert Metzger rmetz...@apache.org: I would recommend using HDFS

Re: flink k-means on hadoop cluster

2015-06-04 Thread Robert Metzger
]$ sudo su yarn bash-4.1$ hadoop fs -chmod 777 -chmod: Not enough arguments: expected 2 but got 1 Usage: hadoop fs [generic options] -chmod [-R] MODE[,MODE]... | OCTALMODE PATH... bash-4.1$ you understand? 2015-06-04 17:04 GMT+02:00 Robert Metzger rmetz...@apache.org: It looks like

Re: flink k-means on hadoop cluster

2015-06-04 Thread Robert Metzger
Yes, you have to run these commands in the command line of the Cloudera VM. On Thu, Jun 4, 2015 at 4:28 PM, Pa Rö paul.roewer1...@googlemail.com wrote: you mean run this command on terminal/shell and not define a hue job? 2015-06-04 16:25 GMT+02:00 Robert Metzger rmetz...@apache.org

Re: when return value from linkedlist or map and use in filter function display error

2015-06-07 Thread Robert Metzger
Hi, the problem is that map is a static field. Can you make the map field a non-static variable of the main method? That should resolve the issue. On Sun, Jun 7, 2015 at 2:57 PM, hagersaleh loveallah1...@yahoo.com wrote: when return value from linkedlist or map and use in filter function
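
A sketch of the suggested fix: the map becomes a final local variable of main() and is captured by the filter function, so it is serialized and shipped together with it (all names and values are placeholders):

    import java.util.HashMap;
    import java.util.Map;

    import org.apache.flink.api.common.functions.FilterFunction;
    import org.apache.flink.api.java.ExecutionEnvironment;

    public class NonStaticMapFilter {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // non-static local variable instead of a static field
            final Map<String, Integer> lookup = new HashMap<String, Integer>();
            lookup.put("foo", 1);

            env.fromElements("foo", "bar", "baz")
                .filter(new FilterFunction<String>() {
                    @Override
                    public boolean filter(String value) {
                        // the captured map travels with the serialized function
                        return lookup.containsKey(value);
                    }
                })
                .writeAsText("file:///tmp/filtered");

            env.execute("filter with captured map");
        }
    }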

Re: I want run flink program in ubuntu x64 Mult Node Cluster what is configuration?

2015-06-07 Thread Robert Metzger
Hi, this guide in our documentation should get you started: http://ci.apache.org/projects/flink/flink-docs-master/setup/cluster_setup.html You basically have to copy flink to all machines and put the hostnames into the slaves file. On Tue, Jun 2, 2015 at 4:00 PM, hagersaleh
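
For illustration, the two files that matter in such a setup (the hostnames are placeholders): conf/slaves lists one worker per line, and conf/flink-conf.yaml on every machine points to the master.

    # conf/slaves
    worker1.example.com
    worker2.example.com

    # conf/flink-conf.yaml (identical on all machines)
    jobmanager.rpc.address: master.example.com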

Re: Documentation Error

2015-06-25 Thread Robert Metzger
Hey Maximilian Alber, I don't know if you are interested in contributing to Flink, but if you would like to, these small fixes to the documentation are really helpful for us! It's actually quite easy to work with the documentation locally. It is located in the docs/ directory of the Flink source.

Re: time measured for each iteration in KMeans

2015-06-26 Thread Robert Metzger
Hi, The TaskManager which is running the Sync task logs when it's starting the next iteration. I know it's not very convenient. You can also log the time and iteration id (from the IterationRuntimeContext) in the open() method. On Fri, Jun 26, 2015 at 9:57 AM, Pa Rö

Re: Flink Streaming State Management

2015-06-20 Thread Robert Metzger
Hey Hilmi, here is a great example of how to use the Checkpointed interface: https://github.com/StephanEwen/flink-demos/blob/master/streaming-state-machine/src/main/scala/com/dataartisans/flink/example/eventpattern/StreamingDemo.scala#L82 On Wed, Jun 17, 2015 at 12:44 AM, Hilmi Yildirim
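
A minimal Java sketch of the same idea, assuming the Checkpointed<T> interface of that era with snapshotState()/restoreState() (exact package names may differ between versions):

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.streaming.api.checkpoint.Checkpointed;

    public class CountingMapper extends RichMapFunction<String, String>
            implements Checkpointed<Long> {

        private long count = 0;

        @Override
        public String map(String value) {
            count++;
            return value;
        }

        @Override
        public Long snapshotState(long checkpointId, long checkpointTimestamp) {
            return count;        // handed to Flink at every checkpoint
        }

        @Override
        public void restoreState(Long state) {
            count = state;       // handed back after a failure
        }
    }

Checkpointing also has to be switched on via env.enableCheckpointing(interval) for snapshotState() to be invoked.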

Re: Flink 0.9 built with Scala 2.11

2015-06-20 Thread Robert Metzger
, Jun 14, 2015 at 8:03 PM Robert Metzger rmetz...@apache.org wrote: There was already a discussion regarding the two options here [1], back then we had a majority for giving all modules a scala suffix. I'm against giving all modules a suffix because we force our users to migrate the name and its

Re: Monitoring memory usage of a Flink Job

2015-06-20 Thread Robert Metzger
You don't have to enable the logging thread. You can also get the metrics of the job manager via the job manager web frontend. There, they are also available in a JSON representation. So if you want, you can periodically (say every 5 seconds) do an HTTP request to get the metrics of all TMs. On Mon,

Re: Flink 0.9 built with Scala 2.11

2015-06-14 Thread Robert Metzger
, flink-scala, …, etc. with version variation. So we can reduce a number of deployed modules. Regards, Chiwan Park On Jun 13, 2015, at 9:17 AM, Robert Metzger rmetz...@apache.org wrote: I agree that we should ship a 2.11 build of Flink if downstream projects need that. The only thing

Re: Reading separate files in parallel tasks as input

2015-06-14 Thread Robert Metzger
Hi Daniel, Are the files in HDFS? What exactly do you mean by `readTextFile` wants to read the file on the JobManager? The JobManager is not reading input files. Also, Flink assigns input splits locally (when reading from distributed file systems). In the JobManager log you can see how many

Re: Flink 0.9 built with Scala 2.11

2015-06-12 Thread Robert Metzger
I agree that we should ship a 2.11 build of Flink if downstream projects need that. The only thing that we should keep in mind when doing this is that the number of jars we're pushing to Maven will explode (but that is fine). We currently have 46 Maven modules and we would create 4 versions of

Re: Best wishes for Kostas Tzoumas and Robert Metzger

2015-06-10 Thread Robert Metzger
to stream the talks or watch them later on? On Mon, Jun 8, 2015 at 2:54 AM, Hawin Jiang hawin.ji...@gmail.com wrote: Hi All As you know that Kostas Tzoumas and Robert Metzger will give us two Flink talks on 2015 Hadoop summit. That is an excellent opportunity to introduce Apache Flink

Re: bug? open() not getting called with RichWindowMapFunction

2015-06-03 Thread Robert Metzger
Since Flink is a community driven open source project, there is no fixed timeline. But there is a thread on this mailing list discussing the next release. My gut feeling (it depends on the community) tells me that we'll have a 0.9 release in the next one or two weeks. On Tue, May 26, 2015 at

Re: flink k-means on hadoop cluster

2015-06-02 Thread Robert Metzger
Hi, you can start Flink on YARN on the Cloudera distribution. See here for more: http://ci.apache.org/projects/flink/flink-docs-master/setup/yarn_setup.html These are the commands you need to execute wget

Re: flink k-means on hadoop cluster

2015-06-02 Thread Robert Metzger
Robert Metzger rmetz...@apache.org: Hi, you can start Flink on YARN on the Cloudera distribution. See here for more: http://ci.apache.org/projects/flink/flink-docs-master/setup/yarn_setup.html These are the commands you need to execute wget http://stratosphere-bin.s3-website-us-east-1

Re: Get file metadata

2015-07-01 Thread Robert Metzger
Hi Ronny, check out this answer on SO: http://stackoverflow.com/questions/30599616/create-objects-from-input-files-in-apache-flink It is a similar use case ... I guess you can get the metadata from the input split as well. On Wed, Jul 1, 2015 at 11:30 AM, Ronny Bräunlich r.braeunl...@gmail.com

Re: Get file metadata

2015-07-01 Thread Robert Metzger
: Hi Robert, just ignore my previous question. My files started with underscore and I just found out that FileInputFormat does filter for underscores in acceptFile(). Cheers, Ronny Am 01.07.2015 um 11:35 schrieb Robert Metzger rmetz...@apache.org: Hi Ronny, check out this answer on SO

Re: ArrayIndexOutOfBoundsException when running job from JAR

2015-06-29 Thread Robert Metzger
. Is this simply not possible or is it a design choice we made for some reason? -V. On 29 June 2015 at 09:53, Robert Metzger rmetz...@apache.org wrote: It is working in the IDE because there we execute everything in the same JVM, so the mapper can access the correct value of the static variable

Re: Custom Class for state checkpointing

2015-08-18 Thread Robert Metzger
Hi Rico, I'm pretty sure that this is a valid bug you've found, since this case is not yet tested (afaik). We'll fix the issue asap. Until then, are you able to encapsulate your state in something that is available in Flink, for example a TupleX, or just serialize it yourself into a byte[]? On

Re: Serialization and kryo

2015-08-17 Thread Robert Metzger
Hi Jay, this is how you can register a custom Kryo serializer, yes. Flink has this project (https://github.com/magro/kryo-serializers) as a dependency. It contains a lot of Kryo serializers for common types. They also added support for Guava's ImmutableMap, but the version we are using (0.27)
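
For example, a sketch of registering the Joda DateTime serializer from that project, assuming the registerTypeWithKryoSerializer hook on the ExecutionConfig:

    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.joda.time.DateTime;

    import de.javakaffee.kryoserializers.jodatime.JodaDateTimeSerializer;

    public class KryoRegistration {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // let Kryo use the dedicated Joda serializer instead of its generic fallback
            env.getConfig().registerTypeWithKryoSerializer(DateTime.class, JodaDateTimeSerializer.class);

            // ... build and execute the job as usual
        }
    }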

Re: Custom Class for state checkpointing

2015-08-18 Thread Robert Metzger
for deserialization) via getRuntimeContext().getUserCodeClassLoader(). Let us know if that workaround works. We'll try to get a fix for that out very soon! Greetings, Stephan On Tue, Aug 18, 2015 at 12:23 PM, Robert Metzger rmetz...@apache.org wrote: Java's HashMap is serializable. If it is only

Re: Custom Class for state checkpointing

2015-08-18 Thread Robert Metzger
I'm still working on writing a test case for reproducing the issue. Which Flink version are you using? If you are using 0.10-SNAPSHOT, which exact commit? On Tue, Aug 18, 2015 at 2:09 PM, Robert Metzger rmetz...@apache.org wrote: I created a JIRA for the issue: https://issues.apache.org/jira

Re: Documentation Error

2015-06-30 Thread Robert Metzger
+1, let's remove the FAQ from the source repo and put it on the website only. On Thu, Jun 25, 2015 at 3:14 PM, Ufuk Celebi u...@apache.org wrote: On 25 Jun 2015, at 14:31, Maximilian Michels m...@apache.org wrote: Thanks for noticing, Chiwan. I have the feeling this problem arose when the

Re: Flink Kafka cannot find org/I0Itec/zkclient/serialize/ZkSerializer

2015-07-28 Thread Robert Metzger
Yes, I was running exactly that code. This is a repository containing the files: https://github.com/rmetzger/scratch/tree/flink-sbt-master Here is the program: https://github.com/rmetzger/scratch/blob/flink-sbt-master/src/main/scala/org/myorg/quickstart/Job.scala On Tue, Jul 28, 2015 at 2:01 AM,

Re: Options for the JVM in the TaskManagers

2015-07-29 Thread Robert Metzger
Hi Juan, there is a configuration option which is not documented in the 0.9 documentation: - env.java.opts: Set custom JVM options. This value is respected by Flink’s start scripts and Flink’s YARN client. This can be used to set different garbage collectors or to include remote
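
For example, in conf/flink-conf.yaml; the concrete options (a GC switch and a remote-debug agent) are just placeholders:

    env.java.opts: -XX:+UseG1GC -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005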

Re: extract workers' resources stats

2015-08-07 Thread Robert Metzger
Hi Stefanos, you can also write yourself a little script/tool which is periodically requesting the following JSON from the JobManager: http://localhost:8081/setupInfo?get=taskmanagers_=1438972693441 It returns a JSON string like this:
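
A tiny Java sketch of such a poller, using only the JDK and assuming the setupInfo endpoint shown above; host, port, and polling interval are placeholders:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;

    public class TaskManagerStatsPoller {
        public static void main(String[] args) throws Exception {
            while (true) {
                URL url = new URL("http://localhost:8081/setupInfo?get=taskmanagers");
                try (BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()))) {
                    String line;
                    while ((line = in.readLine()) != null) {
                        System.out.println(line);   // raw JSON with per-TaskManager stats
                    }
                }
                Thread.sleep(10_000);               // poll every 10 seconds
            }
        }
    }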

Re: java.lang.ClassNotFoundException when deploying streaming jar locally

2015-08-06 Thread Robert Metzger
Hi, how did you build the jar file? Have you checked whether your classes are in the jar file? On Thu, Aug 6, 2015 at 11:08 AM, Michael Huelfenhaus m.huelfenh...@davengo.com wrote: Hello everybody I am trying to build a very simple streaming application with the nightly build of flink

Re: Yarn configuration

2015-07-27 Thread Robert Metzger
a big machine (8 core, 30GB) on it? I thought it was the jm but it is not On 27 Jul 2015, at 14:56, Robert Metzger rmetz...@apache.org wrote: Hi Michele, no in an EMR configuration with 1 master and 5 core I have 5 active node in the resource manager…sounds

Re: Flink Kafka cannot find org/I0Itec/zkclient/serialize/ZkSerializer

2015-07-27 Thread Robert Metzger
Thank you for posting the full SBT files. I now understand why you exclude the Kafka dependency from Flink. SBT does not support reading Maven properties that are only defined in profiles. I will fix the issue for Flink 0.10 ( https://issues.apache.org/jira/browse/FLINK-2408). I was not able to reproduce

Re: Yarn configuration

2015-07-24 Thread Robert Metzger
Hi Michele, configuring a YARN cluster to allocate all available resources as well as possible is sometimes tricky, that is true. We are aware of these problems and there are actually the following two JIRAs for this: https://issues.apache.org/jira/browse/FLINK-937 (Change the YARN Client to

Re: Cluster installation gives java.lang.NoClassDefFoundError for everything

2015-11-11 Thread Robert Metzger
Is "jar tf /users/camelia/thecluster/flink-0.9.1/lib/flink-dist-0.9.1.jar" listing for example the "org/apache/flink/client/CliFrontend" class? On Wed, Nov 11, 2015 at 12:09 PM, Maximilian Michels wrote: > Hi Camelia, > > Flink 0.9.X supports Java 6. So this can't be the issue.

Re: finite subset of an infinite data stream

2015-11-11 Thread Robert Metzger
I think what you call "union" is a "connected stream" in Flink. Have a look at this example: https://gist.github.com/fhueske/4ea5422edb5820915fa4 It shows how to dynamically update a list of filters by external requests. Maybe that's what you are looking for? On Wed, Nov 11, 2015 at 12:15 PM,
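
A rough Java sketch of that pattern, assuming socket sources as placeholders and the CoFlatMapFunction API (exact package names may differ between versions): the control stream updates a set of blocked values that is applied to the data stream.

    import java.util.HashSet;
    import java.util.Set;

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
    import org.apache.flink.util.Collector;

    public class DynamicFilterJob {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            DataStream<String> data = env.socketTextStream("localhost", 9999);
            DataStream<String> control = env.socketTextStream("localhost", 9998);

            data.connect(control)
                .flatMap(new CoFlatMapFunction<String, String, String>() {
                    private final Set<String> blocked = new HashSet<String>();

                    @Override
                    public void flatMap1(String value, Collector<String> out) {
                        if (!blocked.contains(value)) {
                            out.collect(value);   // data stream, filtered by the current rules
                        }
                    }

                    @Override
                    public void flatMap2(String rule, Collector<String> out) {
                        blocked.add(rule);        // control stream adds/updates rules
                    }
                })
                .print();

            env.execute("dynamic filter");
        }
    }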

Re: Hello World Flink 0.10

2015-11-13 Thread Robert Metzger
> OK, so as far as I understand the line ".trigger(new EOFTrigger)" is > simply not needed in this case (it looks that it works as expected when I > remove it). > > Thanks for your help. > > Fri, 13.11.2015 at 11:01, Robert Metzger <rmetz

Re: Apache Flink Operator State as Query Cache

2015-11-13 Thread Robert Metzger
Hi Welly, Flink 0.10.0 is out, it's just not announced yet. It's available on Maven Central and the global mirrors are currently syncing it. This mirror for example has the update already: http://apache.mirror.digionline.de/flink/flink-0.10.0/ On Fri, Nov 13, 2015 at 9:50 AM, Welly Tambunan

Re: Join Stream with big ref table

2015-11-13 Thread Robert Metzger
Hi Arnaud, I'm happy that you were able to resolve the issue. If you are still interested in the first approach, you could try some things, for example using only one slot per task manager (the slots share the heap of the TM). Regards, Robert On Fri, Nov 13, 2015 at 9:18 AM, LINZ, Arnaud

Re: Hello World Flink 0.10

2015-11-13 Thread Robert Metzger
Hi Kamil, The EOFTrigger is not part of Flink. However, I've also tried implementing the Hello World from the presentation here: https://github.com/rmetzger/scratch/blob/flink0.10-scala2.11/src/main/scala/com/dataartisans/Job.scala Stephan Ewen told me that there is a more elegant way of

Re: Running on a firewalled Yarn cluster?

2015-11-02 Thread Robert Metzger
Hi Niels, so the problem is that you can not submit a job to Flink using the "/bin/flink" tool, right? I assume Flink and its TaskManagers properly start and connect to each other (the number of TaskManagers is shown correctly in the web interface). I see the following solutions for the problem

Re: IT's with POJO's

2015-11-05 Thread Robert Metzger
Hi Nick, both JIRA and the mailing list are good. In this case I'd say JIRA would be better because then everybody has the full context of the discussion. The issue is fixed in 0.10, which is not yet released. You can work around the issue by implementing a custom SourceFunction which returns

Re: Running continuously on yarn with kerberos

2015-11-05 Thread Robert Metzger
Hi Niels, thank you for analyzing the issue so thoroughly. I agree with you. It seems that HDFS and HBase are using their own tokens which we need to transfer from the client to the YARN containers. We should be able to port the fix from Spark (which they got from Storm) into our YARN client. I think

Re: Running on a firewalled Yarn cluster?

2015-11-05 Thread Robert Metzger
>> https://issues.apache.org/jira/browse/FLINK-2960 >> >> Best regards, >> Max >> >> On Tue, Nov 3, 2015 at 11:02 AM, Niels Basjes <ni...@basjes.nl> wrote: >> >>> Hi, >>> >>> I forgot to answer your other question: >>>

Re: FlinkKafkaConsumer bootstrap.servers vs. broker hosts

2015-10-14 Thread Robert Metzger
Hi Juho, sorry for the late reply, I was busy with Flink Forward :) The Flink Kafka Consumer needs both addresses. Kafka uses the bootstrap servers to connect to the brokers to consume messages. The Zookeeper connection is used to commit the offsets of the consumer group once a state snapshot in

Re: Building Flink with hadoop 2.6

2015-10-14 Thread Robert Metzger
Hi Gwen, can you tell us the "mvn" command you're using for building Flink? On Wed, Oct 14, 2015 at 4:37 PM, Gwenhael Pasquiers < gwenhael.pasqui...@ericsson.com> wrote: > Hi ; > > > > We need to test some things with flink and hadoop 2.6 (the trunc method). > > > > We’ve set up a build task

Re: flink kafka question

2015-10-14 Thread Robert Metzger
I would also suggest to create a mapper after the source. Make sure the mapper is chained to the kafka source, then, you'll not really see a big delay in the timestamp written to redis. Just out of curiosity, why do you need to write a timestamp to redis for each record from Kafka? On Wed, Oct

Re: Processing S3 data with Apache Flink

2015-10-06 Thread Robert Metzger
> > Thank you, > Konstantin Kudryavtsev > > On Mon, Oct 5, 2015 at 10:13 PM, Robert Metzger <rmetz...@apache.org> > wrote: > >> Hi Kostia, >> >> thank you for writing to the Flink mailing list. I actually started to >> try out our S3 File system

Re: Processing S3 data with Apache Flink

2015-10-06 Thread Robert Metzger
> WBR, > Kostia > > Thank you, > Konstantin Kudryavtsev > > On Tue, Oct 6, 2015 at 2:12 PM, Robert Metzger <rmetz...@apache.org> > wrote: > >> Mh. I tried out the code I've posted yesterday and it was working >> immediately. >> The security settings o

Re: [0.10-SNAPSHOT ] When naming yarn application (yarn-session -nm), flink run without -m fails.

2015-08-26 Thread Robert Metzger
because the system has been restarted anyways, this can actually be a problem if you want to resume a YARN cluster after you have restarted your system. On Wed, Aug 26, 2015 at 3:34 PM, Robert Metzger rmetz...@apache.org wrote: Yep. I think the start-*.sh scripts are also writing the PID to tmp

Re: [0.10-SNAPSHOT ] When naming yarn application (yarn-session -nm), flink run without -m fails.

2015-08-28 Thread Robert Metzger
, 2015 at 5:58 PM, Robert Metzger rmetz...@apache.org wrote: Therefore, my change will include a configuration option to set a custom location for the file. On Wed, Aug 26, 2015 at 5:55 PM, Maximilian Michels m...@apache.org wrote: The only problem with writing the temp

Re: Flink YARN Client requested shutdown in flink -m yarn-cluster mode?

2015-08-28 Thread Robert Metzger
Is the log from 0.9-SNAPSHOT or 0.10-SNAPSHOT? Can you send me (if you want privately as well) the full log of the yarn application: yarn logs -applicationId appId. We need to find out why the TaskManagers are shutting down. That is most likely logged in the TaskManager logs. On Fri, Aug 28,

Re: Flink YARN Client requested shutdown in flink -m yarn-cluster mode?

2015-08-28 Thread Robert Metzger
was to launch the job in detached mode (-yd) when my main function was not waiting after execution and was immediately ending. Sorry for my misunderstanding of this option. Best regards, Arnaud *From:* Robert Metzger [mailto:rmetz...@apache.org] *Sent:* Friday, 28 August 2015 11:03 *To

Re: Best way for simple logging in jobs?

2015-08-28 Thread Robert Metzger
Hi, Creating a slf4j logger like this: private static final Logger LOG = LoggerFactory.getLogger(PimpedKafkaSink.class); Works for me. The messages also end up in the regular YARN logs. Also system out should end up in YARN actually (when retrieving the logs from the log aggregation). Regards,
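
For example, a sketch of such a function-level logger; the class name and log message are placeholders:

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class LoggingMapper extends RichMapFunction<String, String> {

        private static final Logger LOG = LoggerFactory.getLogger(LoggingMapper.class);

        @Override
        public String map(String value) {
            LOG.info("Processing record: {}", value);  // shows up in the TaskManager / YARN logs
            return value;
        }
    }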

Re: [0.10-SNAPSHOT ] When naming yarn application (yarn-session -nm), flink run without -m fails.

2015-08-26 Thread Robert Metzger
nice to have an application write the configuration dir; it's often a root-protected directory in usr/lib/flink. Is there a parameter to put that file elsewhere? *From:* Robert Metzger [mailto:rmetz...@apache.org] *Sent:* Wednesday, 26 August 2015 14:42 *To:* user@flink.apache.org *Subject

Re: Custom Class for state checkpointing

2015-08-31 Thread Robert Metzger
he user code (needed for deserialization) > via "getRuntimeContext().getUserCodeClassLoader()". > > Let us know if that workaround works. We'll try to get a fix for that out > very soon! > > Greetings, > Stephan > > > > On Tue, Aug 18, 2015 at 12:23 PM, Robert Metzger &

Re: Travis updates on Github

2015-09-02 Thread Robert Metzger
Hi Sachin, I also noticed that the GitHub integration is not working properly. I'll ask the Apache Infra team. On Wed, Sep 2, 2015 at 10:20 AM, Sachin Goel wrote: > Hi all > Is there some issue with travis integration? The last three pull requests > do not have their

Re: Usage of Hadoop 2.2.0

2015-09-03 Thread Robert Metzger
I think most cloud providers moved beyond Hadoop 2.2.0. Google's Click-To-Deploy is on 2.4.1; AWS EMR is on 2.6.0. The situation for the distributions seems to be the following: MapR 4 uses Hadoop 2.4.0 (current is MapR 5); CDH 5.0 uses 2.3.0 (the current CDH release is 5.4); HDP 2.0 (October 2013)

Re: Performance Issue

2015-09-08 Thread Robert Metzger
u > load. But where are the information on the GC? > > > > Am 08.09.2015 um 09:34 schrieb Robert Metzger <rmetz...@apache.org>: > > The webinterface of Flink has a tab for the TaskManagers. There, you can > also see how much time the JVM spend with garbage collection. > Can

Re: Performance Issue

2015-09-08 Thread Robert Metzger
The web interface of Flink has a tab for the TaskManagers. There, you can also see how much time the JVM spent with garbage collection. Can you check whether the number of GC calls + the time spent goes up after 30 minutes? On Tue, Sep 8, 2015 at 8:37 AM, Rico Bergmann

Re: The main method caused an error. HDFS flink

2015-09-08 Thread Robert Metzger
Which version of Flink are you using? Have you tried submitting the job using the "./bin/flink run" tool? On Tue, Sep 8, 2015 at 11:44 AM, Florian Heyl wrote: > Dear Sir or Madam, > Me and my colleague are developing a pipeline based on scala and java to > classify cancer

Re: The main method caused an error. HDFS flink

2015-09-08 Thread Robert Metzger
rect control of the cluster because our prof is > responsible for that. > The students are using the flink web submission client to upload their jar > and run it on the cluster. > > > Am 08.09.2015 um 12:48 schrieb Robert Metzger <rmetz...@apache.org>: > > Which version of Flink ar

Re: nosuchmethoderror

2015-09-03 Thread Robert Metzger
I'm sorry that we changed the method name between minor versions. We'll soon bring some infrastructure in place to a) mark the audience of classes and b) ensure that public APIs are stable. On Wed, Sep 2, 2015 at 9:04 PM, Ferenc Turi wrote: > Ok. As I see only the method name
