??????kafka consumer exception

2019-02-19 Thread ForwardXu
kafka consumerclient-id??flink??client-id??kafka??jira??

flink sink kafka exception

2019-02-19 Thread 董鹏
flink大神,你们好。flink sink kafka 遇到这个异常,不影响job运行,不影响结果,偶尔抛出。向你们请教一下,希望获取些思路。2019-02-20 10:08:46.889 +0800 [Source: rn -> Flat Map -> async wait operator -> async wait operator -> Sink: Unnamed (17/20)] ERROR [org.apache.flink.streaming.runtime.tasks.StreamTask] [StreamTask.java:481] - Error during

kafka consumer exception

2019-02-19 Thread 董鹏
flink大神你们好,在使用flink on kafka(1.0版本) 遇到如下异常: 不影响job,不影响结果,对于这个异常偶尔打出,你们是否有遇到这个问题呢? [org.apache.kafka.common.utils.AppInfoParser] [AppInfoParser.java:60] - Error registering AppInfo mbean javax.management.InstanceAlreadyExistsException: kafka.consumer:type=app-info,id=consumer-31 at

Re: Reading messages from start - new job submission

2019-02-19 Thread Dian Fu
Hi Avi, As described in the documentation: "If offsets could not be found for a partition, the auto.offset.reset setting in the properties will be used.". For starting from GroupOffset, the property "auto.offset.reset" will ONLY be respected when the group offset cannot be found for a

Re: EXT :Re: How to debug difference between Kinesis and Kafka

2019-02-19 Thread Dian Fu
DataStream.assignTimestampsAndWatermarks will add a watermark generator operator after each source operator(if their parallelism is the same which is true for the code you showed) and so if one instance of the source operator has no data, the corresponding watermark generator operator cannot

How to use my custom log4j.properties when running minicluster in idea

2019-02-19 Thread peibin wang
Hi,    I am running flink job in the Intellij IDEA  with mini cluster (not submit it to the flink cluster ) for  convenience . Now I have put my custom log config file ( both log4j.properties and logback.xml)  in src/main/resources/. But it does not work. Is there any solutions?

Re: Starting Flink cluster and running a job

2019-02-19 Thread Boris Lublinsky
Thanks Ken, That was my first instinct as well, but.. To run on the cluster I am building an uber jar for which I am fixing Kafka clients jar version I am also fixing version of Kafka So I do not know where another version can get from Boris Lublinsky FDP Architect boris.lublin...@lightbend.com

Re: Starting Flink cluster and running a job

2019-02-19 Thread Ken Krugler
Hi Boris, I haven’t seen this exact error, but I have seen similar errors caused by multiple versions of jars on the classpath. When I’ve run into this particular "XXX is not an instance of YYY" problem, it often seems to be caused by a jar that I should have marked as provided in my pom.

Re: Jira issue Flink-11127

2019-02-19 Thread Boris Lublinsky
Thanks Konstantin Unfortunately it does not work The snippet from task manager yaml is containers: - name: taskmanager image: {{ .Values.image }}:{{ .Values.imageTag }} imagePullPolicy: {{ .Values.imagePullPolicy }} args: - taskmanager -Dtaskmanager.host=$(K8S_POD_IP) ports: - name:

Re: Starting Flink cluster and running a job

2019-02-19 Thread Boris Lublinsky
Konstantin, After experimenting with this for a while, I got to the root cause of the problem I am running a version of a Taxi ride travel prediction as my sample. It works fine in Intellij, But when I am trying to put it in the docker (standard Debian 1.7 image) It fails with a following error

Assigning timestamps and watermarks several times, several datastreams?

2019-02-19 Thread Aakarsh Madhavan
Hi! Currently I am using Flink 1.4.2. class TSWM implements AssignerWithPunctuatedWatermarks { long maxTS = Long.MIN_VALUE; @Override public Watermark checkAndGetNextWatermark(POJO event, long l) { maxTS = Math.max(maxTS, event.TS); return new Watermark(getMaxTimestamp()); }

Metrics for number of "open windows"?

2019-02-19 Thread Andrew Roberts
Hello, I’m trying to track the number of currently-in-state windows in a keyed, windowed stream (stream.keyBy(…).window(…).trigger(…).process(…)) using Flink metrics. Are there any built in? Or any good approaches for collecting this data? Thanks, Andrew -- *Confidentiality Notice: The

Re: Reading messages from start - new job submission

2019-02-19 Thread avilevi
Thanks for the answer, But my question is why do I need to set /myConsumer.setStartFromEarliest();/ if I set this property /setProperty("auto.offset.reset", "earliest") /in consumer properties ? I want the consumer to start reading from earliest only If offsets could not be found as stated in

Re: EXT :Re: How to debug difference between Kinesis and Kafka

2019-02-19 Thread Stephen Connolly
Though I am explicitly assigning watermarks with DataStream.assignTimestampsAndWatermarks and I see all the data flowing through that... so shouldn't that override the watermarks from the original source? On Tue, 19 Feb 2019 at 15:59, Martin, Nick wrote: > Yeah, that’s expected/known.

Re: FLIP-16, FLIP-15 Status Updates?

2019-02-19 Thread John Tipper
Hi Timo, That’s great, thank you very much. If I’d like to contribute, is it best to wait until the roadmap has been published? And is this the best list to ask on, or is the development mailing list better? Many thanks, John Sent from my iPhone > On 19 Feb 2019, at 16:29, Timo Walther

Re: FLIP-16, FLIP-15 Status Updates?

2019-02-19 Thread Timo Walther
Hi John, you are right that there was not much progress in the last years around these two FLIPs. Mostly due to shift of priorities. However, with the big Blink code contribution from Alibaba and joint development forces for a unified batch and streaming runtime [1], it is very likely that

Re: Dataset statistics

2019-02-19 Thread Flavio Pompermaier
We've just published a first attempt (on Flink 1.6.2) that extract some descriptive statistics from a batch dataset[1]. Any feedback is welcome. Best, Flavio [1] https://github.com/okkam-it/flink-descriptive-stats On Thu, Feb 14, 2019 at 11:19 AM Flavio Pompermaier wrote: > No effort in this

RE: EXT :Re: How to debug difference between Kinesis and Kafka

2019-02-19 Thread Martin, Nick
Yeah, that’s expected/known. Watermarks for the empty partition don’t advance, so the window in your window function never closes. There’s a ticket open to fix it (https://issues.apache.org/jira/browse/FLINK-5479) for the kafka connector, but in general any time one parallel instance of a

Re: Get nested Rows from Json string

2019-02-19 Thread françois lacombe
Hi Rong, Thank you for JIRA. Understood it may be solved in a next release, I'll comment the ticket in case of further input All the best François Le sam. 9 févr. 2019 à 00:57, Rong Rong a écrit : > Hi François, > > I just did some research and seems like this is in fact a Stringify issue. >

Re: How to load multiple same-format files with single batch job?

2019-02-19 Thread françois lacombe
Hi Fabian, After a bit more documentation reading I have a better understanding of how InputFormat interface works. Indeed I've better to wrap a custom InputFormat implementation in my source. This article helps a lot https://brewing.codes/2017/02/06/implementing-flink-batch-data-connector/

Re: How to debug difference between Kinesis and Kafka

2019-02-19 Thread Stephen Connolly
Hmmm my suspicions are now quite high. I created a file source that just replays the events straight then I get more results On Tue, 19 Feb 2019 at 11:50, Stephen Connolly < stephen.alan.conno...@gmail.com> wrote: > Hmmm after expanding the dataset such that there was additional data that >

Re: How to debug difference between Kinesis and Kafka

2019-02-19 Thread Stephen Connolly
Hmmm after expanding the dataset such that there was additional data that ended up on shard-0 (everything in my original dataset was coincidentally landing on shard-1) I am now getting output... should I expect this kind of behaviour if no data arrives at shard-0 ever? On Tue, 19 Feb 2019 at

FLIP-16, FLIP-15 Status Updates?

2019-02-19 Thread John Tipper
Hi All, Does anyone know what the current status is for FLIP-16 (loop fault tolerance) and FLIP-15 (redesign iterations) please? I can see lots of work back in 2016, but it all seemed to stop and go quiet since about March 2017. I see iterations as offering very interesting capabilities for

How to debug difference between Kinesis and Kafka

2019-02-19 Thread Stephen Connolly
Hi, I’m having a strange situation and I would like to know where I should start trying to debug. I have set up a configurable swap in source, with three implementations: 1. A mock implementation 2. A Kafka consumer implementation 3. A Kinesis consumer implementation >From injecting a log and

Re: Starting Flink cluster and running a job

2019-02-19 Thread Konstantin Knauf
Hi Boris, without looking at the entrypoint in much detail, generally there should not be a race condition there: * if the taskmanagers can not connect to the resourcemanager they will retry (per default the timeout is 5 mins) * if the JobManager does not get enough resources from the

Re: Jira issue Flink-11127

2019-02-19 Thread Konstantin Knauf
Hi Boris, the solution is actually simpler than it sounds from the ticket. The only thing you need to do is to set the "taskmanager.host" to the Pod's IP address in the Flink configuration. The easiest way to do this is to pass this config dynamically via a command-line parameter. The Deployment

Stream enrichment with static data, side inputs for DataStream

2019-02-19 Thread Artur Mrozowski
Hi, I have a stream of buildings and each building has foreign key reference to municipality. Municipalities data set is quite static. Both are placed on Kafka topics. I want to enrich each building with municipality name. FLIP 17, proposal would be ideal for this use case but it's still just a

Re: subscribe

2019-02-19 Thread Artur Mrozowski
Will do, thanks! On Tue, Feb 19, 2019 at 8:57 AM Fabian Hueske wrote: > Hi Artur, > > In order to subscribe to Flink's user mailing list you need to send a mail > to user-subscr...@flink.apache.org > > Best, Fabian > > Am Mo., 18. Feb. 2019 um 20:34 Uhr schrieb Artur Mrozowski < >