Re: Flink job hangs/deadlocks (possibly related to out of memory)

2018-07-02 Thread Zhijiang(wangzhijiang999)
Hi Gerard, I think you can check the job manager log to find which task failed first, and then trace the task manager log containing the failed task to find the initial reason. The failed task will trigger canceling all the other tasks, and during the canceling process, the blocked task that is

Mis-reported metrics using SideOutput stream + Broadcast

2018-07-02 Thread Cliff Resnick
Our topology has a metadata source that we push via Broadcast. Because this metadata source is critical, but sometimes late, we added a buffering mechanism via a SideOutput. We call the initial look-up from Broadcast "join" and the secondary, state-backed buffered lookup, "late-join". Today I
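
For context, a minimal sketch of the side-output half of that pattern. The Event type, the events stream, and the metadataAvailable lookup are hypothetical stand-ins (the thread's actual lookup goes through broadcast state):

    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.streaming.api.functions.ProcessFunction
    import org.apache.flink.util.Collector

    case class Event(metaKey: String, payload: String)

    // Events whose metadata has not arrived yet are diverted to a
    // "late-join" side output instead of being dropped.
    val lateJoin = OutputTag[Event]("late-join")

    val joined = events.process(new ProcessFunction[Event, Event] {
      override def processElement(e: Event,
          ctx: ProcessFunction[Event, Event]#Context,
          out: Collector[Event]): Unit = {
        if (metadataAvailable(e.metaKey)) out.collect(e) // "join"
        else ctx.output(lateJoin, e)                     // buffer for "late-join"
      }
    })

    val lateStream = joined.getSideOutput(lateJoin)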

Re: Existing files and directories are not overwritten in NO_OVERWRITE mode. Use OVERWRITE mode to overwrite existing files and directories.

2018-07-02 Thread Hequn Cheng
Hi Mich, It seems the writeMode has not been set correctly. Have you ever tried > .writeAsText("/tmp/md_streaming.txt", FileSystem.WriteMode.OVERWRITE); On Mon, Jul 2, 2018 at 10:44 PM, Mich Talebzadeh wrote: > Flink 1.5 > > This streaming data written to a file > > val stream = env >
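
For reference, a minimal corrected version of the original snippet (assuming the FlinkKafkaConsumer09 setup from the question):

    import org.apache.flink.core.fs.FileSystem

    val stream = env
      .addSource(new FlinkKafkaConsumer09[String]("md", new SimpleStringSchema(), properties))
      // Passing WriteMode.OVERWRITE replaces an existing file instead of failing
      .writeAsText("/tmp/md_streaming.txt", FileSystem.WriteMode.OVERWRITE)
    env.execute("Flink Kafka Example")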

Description of Flink event time processing

2018-07-02 Thread Elias Levy
The documentation of how Flink handles event time and watermarks is spread across several places. I've been wanting a single location that summarizes the subject, and as none was available, I wrote one up. You can find it here:
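
As a quick illustration of the two moving parts such a write-up covers, a sketch with a hypothetical Event type and an assumed events stream: set the time characteristic, then attach a timestamp/watermark assigner.

    import org.apache.flink.streaming.api.TimeCharacteristic
    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor
    import org.apache.flink.streaming.api.windowing.time.Time

    case class Event(ts: Long)

    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)

    // Watermarks trail the highest seen timestamp by 10s, bounding out-of-orderness
    val withWatermarks = events.assignTimestampsAndWatermarks(
      new BoundedOutOfOrdernessTimestampExtractor[Event](Time.seconds(10)) {
        override def extractTimestamp(e: Event): Long = e.ts
      })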

Kafka Avro Table Source

2018-07-02 Thread Will Du
Hi folks, I am working on mapping an Avro table source to a Kafka source. Looking at the example, I think the current Flink v1.5.0 connector is not flexible enough. I wonder if I have to specify the Avro record class to read from Kafka. For withSchema, the schema can get from schema
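
For orientation, a sketch of the builder usage as I understand the Flink 1.5 connector; SensorReading is a hypothetical Avro-generated SpecificRecord class, and the exact builder method names should be checked against the 1.5 docs:

    import java.util.Properties
    import org.apache.flink.streaming.connectors.kafka.Kafka09AvroTableSource

    val props = new Properties()
    props.setProperty("bootstrap.servers", "localhost:9092")
    props.setProperty("group.id", "avro-test")

    // The builder is tied to a generated Avro record class
    val tableSource = Kafka09AvroTableSource.builder()
      .forTopic("sensors")
      .withKafkaProperties(props)
      .forAvroRecordClass(classOf[SensorReading])
      .build()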

Re: java.lang.NoSuchMethodError: org.apache.kafka.clients.consumer.KafkaConsumer.assign

2018-07-02 Thread Mich Talebzadeh
This is becoming very tedious. As suggested I changed the kafka dependency from libraryDependencies += "org.apache.kafka" %% "kafka" % "1.1.0" to libraryDependencies += "org.apache.kafka" %% "kafka" % "0.11.0.0", compiled, and ran the job again, and it failed. This is the log file 2018-07-02
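
The root cause is a kafka-clients version mismatch: FlinkKafkaConsumer09 was compiled against the 0.9 client's assign(List), which later clients replaced with assign(Collection). A build.sbt sketch that lets the connector pull in a compatible client, assuming Flink 1.5.0 and Scala 2.11:

    // Let the Flink connector bring in its matching kafka-clients version
    libraryDependencies += "org.apache.flink" %% "flink-streaming-scala" % "1.5.0" % "provided"
    libraryDependencies += "org.apache.flink" %% "flink-connector-kafka-0.9" % "1.5.0"
    // Avoid also pinning "org.apache.kafka" %% "kafka" directly: a newer client
    // on the classpath lacks KafkaConsumer.assign(Ljava/util/List;)V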

Re: java.lang.NoSuchMethodError: org.apache.kafka.clients.consumer.KafkaConsumer.assign

2018-07-02 Thread Mich Talebzadeh
Hi, This is the code import java.util.Properties import java.util.Arrays import org.apache.flink.api.common.functions.MapFunction import org.apache.flink.api.java.utils.ParameterTool import org.apache.flink.streaming.api.datastream.DataStream import

Re: java.lang.NoSuchMethodError: org.apache.kafka.clients.consumer.KafkaConsumer.assign

2018-07-02 Thread Jörn Franke
Looks like a version issue; have you made sure that the Kafka version is compatible? > On 2. Jul 2018, at 18:35, Mich Talebzadeh wrote: > > Have you seen this error by any chance in flink streaming with Kafka please? > > org.apache.flink.client.program.ProgramInvocationException: >

Regarding external metastore like HIVE

2018-07-02 Thread Shivam Sharma
Hi, I have read in the Flink documentation that Flink supports a metastore, which is currently in-memory. Is the Flink community planning to implement an external metastore like Hive? Thanks -- Shivam Sharma Data Engineer @ Goibibo Indian Institute Of Information Technology, Design and Manufacturing Jabalpur

java.lang.NoSuchMethodError: org.apache.kafka.clients.consumer.KafkaConsumer.assign

2018-07-02 Thread Mich Talebzadeh
Have you seen this error by any chance in flink streaming with Kafka please? org.apache.flink.client.program.ProgramInvocationException: java.lang.NoSuchMethodError: org.apache.kafka.clients.consumer.KafkaConsumer.assign(Ljava/util/List;)V at

Re: The program didn't contain a Flink job

2018-07-02 Thread Rong Rong
Hmm, that's strange. Can you explain a little more about how your YARN cluster is set up and how you configure the submission context? Also, did you try submitting the jobs in detached mode? Is this happening from time to time for one specific job graph? Or is it consistently throwing the exception

Re: The program didn't contain a Flink job

2018-07-02 Thread eSKa
No, execute was called and all calculations succeeded; there was a job on the dashboard with status FINISHED. After execute, our logs claimed that everything succeeded.

Re: The program didn't contain a Flink job

2018-07-02 Thread Rong Rong
Did you forget to call executionEnvironment.execute() after you define your Flink job? -- Rong On Mon, Jul 2, 2018 at 1:42 AM eSKa wrote: > Hello, We are currently running jobs on Flink 1.4.2. Our usecase is as > follows: > -service get request from customer > - we submit job to flink using
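
A minimal sketch of what the reply is pointing at: without the final execute() call, the job graph is built but never submitted, which yields exactly this error.

    import org.apache.flink.streaming.api.scala._

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.fromElements(1, 2, 3).map(_ * 2).print()
    env.execute("my-job") // omitting this => "The program didn't contain a Flink job"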

Re: Memory Leak in ProcessingTimeSessionWindow

2018-07-02 Thread ashish pok
All, I have been doing a little digging on this, and to Stefan's point, the growing memory (not necessarily a leak) was essentially due to a growing key space, as I was using time buckets concatenated with the actual key to make unique sessions. Taking a couple of steps back, the use case is
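
A sketch of the revised approach that observation suggests, with a hypothetical Event type and an assumed events stream: key by the stable id alone and let the session window do the time bucketing, so state per key stays bounded.

    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.streaming.api.windowing.assigners.ProcessingTimeSessionWindows
    import org.apache.flink.streaming.api.windowing.time.Time

    case class Event(key: String, value: Long)

    val sessions = events
      .keyBy(_.key) // stable key only, no time-bucket concatenation
      .window(ProcessingTimeSessionWindows.withGap(Time.minutes(10)))
      .reduce((a, b) => Event(a.key, a.value + b.value))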

Existing files and directories are not overwritten in NO_OVERWRITE mode. Use OVERWRITE mode to overwrite existing files and directories.

2018-07-02 Thread Mich Talebzadeh
Flink 1.5 This streaming data written to a file val stream = env .addSource(new FlinkKafkaConsumer09[String]("md", new SimpleStringSchema(), properties)) .writeAsText("/tmp/md_streaming.txt") env.execute("Flink Kafka Example") The error states Caused by:

Re: How to partition within same physical node in Flink

2018-07-02 Thread ashish pok
Thanks Fabian! It sounds like KeyGroup will do the trick if that can be made publicly accessible. On Monday, July 2, 2018, 5:43:33 AM EDT, Fabian Hueske wrote: Hi Ashish, hi Vijay, Flink does not distinguish between different parts of a key (parent, child) in the public APIs.

Re: How to partition within same physical node in Flink

2018-07-02 Thread Fabian Hueske
Hi Ashish, hi Vijay, Flink does not distinguish between different parts of a key (parent, child) in the public APIs. However, there is an internal concept of KeyGroups which is similar to what you call physical partitioning. A KeyGroup is a group of keys that are always processed on the same
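
For the curious, the key-group assignment Fabian describes can be inspected through an internal (non-public, subject-to-change) utility:

    import org.apache.flink.runtime.state.KeyGroupRangeAssignment

    val maxParallelism = 128
    val parallelism = 4

    // Which key group a key hashes into, and which subtask owns that group
    val keyGroup = KeyGroupRangeAssignment.assignToKeyGroup("parent-key", maxParallelism)
    val subtask = KeyGroupRangeAssignment.computeOperatorIndexForKeyGroup(
      maxParallelism, parallelism, keyGroup)
    println(s"key group $keyGroup -> subtask $subtask")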

run time error java.lang.NoClassDefFoundError: org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumer09

2018-07-02 Thread Mich Talebzadeh
Hi, I created a jar file with sbt using this build file (md_streaming.sbt): name := "md_streaming" version := "1.0" scalaVersion := "2.11.8" libraryDependencies += "org.apache.hbase" % "hbase" % "1.2.3" libraryDependencies += "org.apache.hbase" % "hbase-client" % "1.2.6"
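
A NoClassDefFoundError at run time usually means the Kafka connector never made it into the submitted jar: the connector is not part of the Flink distribution, so it must be bundled. A build.sbt sketch, assuming Flink 1.5.0 and sbt-assembly for the fat jar:

    // Core Flink APIs are on the cluster's classpath, so mark them "provided"
    libraryDependencies += "org.apache.flink" %% "flink-scala" % "1.5.0" % "provided"
    libraryDependencies += "org.apache.flink" %% "flink-streaming-scala" % "1.5.0" % "provided"
    // The connector must ship inside the fat jar, so it is NOT "provided"
    libraryDependencies += "org.apache.flink" %% "flink-connector-kafka-0.9" % "1.5.0"

Building with sbt assembly then produces a fat jar that carries the connector classes along with the job.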

Dynamic Rule Evaluation in Flink

2018-07-02 Thread Aarti Gupta
Hi, We are currently evaluating Flink to build a real time rule engine that looks at events in a stream and evaluates them against a set of rules. The rules are dynamically configured and can be of three types - 1. Simple Conditions - these require you to look inside a single event. Example,
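
A minimal sketch of how Flink 1.5's broadcast state pattern can serve the simple-condition case: rules are broadcast to every parallel instance and each event is checked against the current rule set. Event, Rule, and the two input streams (events, rules) are hypothetical.

    import org.apache.flink.api.common.state.MapStateDescriptor
    import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction
    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.util.Collector

    case class Event(fields: Map[String, String])
    case class Rule(id: String, predicate: Event => Boolean)

    val ruleDescriptor =
      new MapStateDescriptor[String, Rule]("rules", classOf[String], classOf[Rule])

    val alerts = events
      .connect(rules.broadcast(ruleDescriptor))
      .process(new BroadcastProcessFunction[Event, Rule, String] {
        override def processElement(e: Event,
            ctx: BroadcastProcessFunction[Event, Rule, String]#ReadOnlyContext,
            out: Collector[String]): Unit = {
          val it = ctx.getBroadcastState(ruleDescriptor).immutableEntries().iterator()
          while (it.hasNext) {
            val rule = it.next().getValue
            if (rule.predicate(e)) out.collect(s"matched rule ${rule.id}")
          }
        }
        override def processBroadcastElement(r: Rule,
            ctx: BroadcastProcessFunction[Event, Rule, String]#Context,
            out: Collector[String]): Unit =
          ctx.getBroadcastState(ruleDescriptor).put(r.id, r) // add/replace a rule at runtime
      })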

The program didn't contain a Flink job

2018-07-02 Thread eSKa
Hello, we are currently running jobs on Flink 1.4.2. Our use case is as follows: - a service gets a request from a customer - we submit a job to Flink using YarnClusterClient. Sometimes we have up to 6 jobs at the same time. From time to time we get the error below: The program didn't contain a Flink job.

[ANNOUNCE] Weekly community update #27

2018-07-02 Thread Till Rohrmann
Dear community, this is the weekly community update thread #27. Please post any news and updates you want to share with the community to this thread. # Feature freeze and release date for Flink 1.6 The community is currently discussing the feature freeze and, thus, also the release date for

Re: Web history limit in flink 1.5

2018-07-02 Thread eSKa
Thank you, I must have missed that option somehow :)

Re: Flink job hangs/deadlocks (possibly related to out of memory)

2018-07-02 Thread Zhijiang(wangzhijiang999)
Hi Gerard, From the stack below, it can only be inferred that the task was canceled, which may have been triggered by the job manager because of another task's failure. If the task cannot be interrupted within the timeout config, the task manager process will exit. Do you see any OutOfMemory messages from the
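
The config being referred to here is most likely the task cancellation timeout; a flink-conf.yaml sketch:

    # If a task cannot be interrupted within this many milliseconds during
    # cancellation, the TaskManager process is terminated (fatal exit)
    task.cancellation.timeout: 180000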