Re: Apache Flink - CEP vs SQL detecting patterns

2019-04-25 Thread Dawid Wysakowicz
Hi, Unfortunately those are just ignored. The timed out partial matches are not emitted. Best, Dawid On 20/04/2019 19:49, M Singh wrote: > Dawid: > > So, what happens when there is a timeout - is there any value/field in > the resulting data stream that indicates that this was a timeout ? > >

Re: Create Custom Sink for DataSet

2019-04-25 Thread Dawid Wysakowicz
Hi Soheil, The equivalent of DataStream's SinkFunction in DataSet API is the mentioned OutputFormat. You can implement the OutputFormat. Best, Dawid On 21/04/2019 20:01, Soheil Pourbafrani wrote: > Hi, Using the DataStream API I could create a Custom Sink > like classRichMySqlSink extends

Re: Constant backpressure on flink job

2019-04-25 Thread Dawid Wysakowicz
Hi Monika, I would start with identifying the operator that causes backpressure. More information how to monitor backpressure you can find here in the docs[1]. You might also be interested in Seth's (cc'ed) webinar[2], where he also talks how to debug backpressure. Best, Dawid [1]

Re: Watermark for each key?

2019-04-25 Thread Congxian Qiu
There was someone working in IoT asking me whether Flink supports per-key watermark also. I’m not sure if we can do the statistics by using raw state manipulating. We create a single state for every single key, and when receiving a key, we extract the timestamp and to see if we need to send

Query - External SSL setup

2019-04-25 Thread L Jainkeri, Suman (Nokia - IN/Bangalore)
Hi, I am trying to authenticate Flink using NGINX. In the document it is mentioned to deploy a "side car proxy", here is the link for the section of the document which I have referred to

Zeppelin

2019-04-25 Thread Smirnov Sergey Vladimirovich (39833)
Hello, Trying to link Zeppelin 0.9 with Flink 1.8. It`s a small dev cluster deployed in standalone manner. Got the same error as described here https://stackoverflow.com/questions/54257671/runnning-a-job-in-apache-flink-standalone-mode-on-zeppelin-i-have-this-error-to Would appreciate for any

Re: State Migration with RocksDB MapState

2019-04-25 Thread Tzu-Li (Gordon) Tai
Hi Cliff, Thanks for bringing this up again. I think it would make sense to at least move this forward be only exclusively checking the schema for user keys in MapState, and allow value schema evolution. I'll comment on the JIRA about this, and also make it a blocker for 1.9.0 to make sure it

Re: Flink CLI

2019-04-25 Thread Marc Rooding
Hi Steven, Oytun You may find the tool we open-sourced last year useful. It offers deploying and updating jobs with savepointing. You can find it on Github: https://github.com/ing-bank/flink-deployer There’s also a docker image available in Docker Hub. Marc On 24 Apr 2019, 17:29 +0200, Oytun

Re: RichAsyncFunction Timer Service

2019-04-25 Thread Dawid Wysakowicz
Hi Mike, I think the reason why there is no access to TimerService in async function is that as it is an async function, there are no guarantees when/and where(at which stage of the pipeline) the function will actually be executed. This characteristic doesn't align with TimerService and timely

Re: Getting JobExecutionException: Could not set up JobManager when trying to upload new version

2019-04-25 Thread Dawid Wysakowicz
Hi Avi, Just as some additional explanation. UID of operator is the way we map state to corresponding operator. This allows loading savepoint with changed DAG as long as the UIDs stay the same. This as you said explain why you got the exception when you changed uid of some of the operators.

Re: TM occasionally hang in deploying state in Flink 1.5

2019-04-25 Thread qi luo
Hello, This issue occurred again and we dumped the TM thread. It indeed hung on socket read to download jar from Blob server: "DataSource (at createInput(ExecutionEnvironment.java:548) (our.code)) (1999/2000)" #72 prio=5 os_prio=0 tid=0x7fb9a1521000 nid=0xa0994 runnable

Re: Identify orphan records after joining two streams

2019-04-25 Thread Dawid Wysakowicz
Hi Averell, I think your original solution is the right one, given your requirements. I don't think it is over complicated. As for the memory concerns, there is no bult-in mechanism for backpressure/alignment based on event time. The community did take that into consideration when discussing the

Re: kafka partitions, data locality

2019-04-25 Thread Dawid Wysakowicz
Hi Smirnov, Actually there is a way to tell Flink that data is already partitioned. You can try the reinterpretAsKeyedStream[1] method. I must warn you though this is an experimental feature. Best, Dawid [1]

Re: assignTimestampsAndWatermarks not work after KeyedStream.process

2019-04-25 Thread Dawid Wysakowicz
Hi, Yes I think your explanation is correct. I can also recommend Seth's webinar where he talks about debugging Watermarks[1] Best, Dawid [1] https://www.ververica.com/resources/webinar/webinar/debugging-flink-tutorial On 22/04/2019 22:55, an0 wrote: > Thanks, I feel I'm getting closer to the

Re: Watermark for each key?

2019-04-25 Thread Till Rohrmann
Your proposal could probably also be implemented by using Flink's support for allowed lateness when defining a window [1]. It has basically the same idea that there might be some elements which violate the watermark semantics and which need to be handled separately. [1]

Re: QueryableState startup regression in 1.8.0 ( migration from 1.7.2 )

2019-04-25 Thread Dawid Wysakowicz
Hi Vishal, As Guowei mentioned you have to enable the Queryable state. The default setting was changed in 1.8.0. There is an open JIRA[1] for changing the documentation accordingly. Best, Dawid [1] https://issues.apache.org/jira/browse/FLINK-12274 On 25/04/2019 03:27, Guowei Ma wrote: > You

Re: Job Startup Arguments

2019-04-25 Thread Chesnay Schepler
The passed job arguments can not be queried via the REST API. When submitting jobs through the CLI these parameters never arrive at the cluster; in case of REST API submission they are immediately discarded after the submission has finished. On 25/04/2019 12:25, Dawid Wysakowicz wrote: Hi

Re: State Migration with RocksDB MapState

2019-04-25 Thread Cliff Resnick
Great news! Thanks On Thu, Apr 25, 2019, 2:59 AM Tzu-Li (Gordon) Tai wrote: > Hi Cliff, > > Thanks for bringing this up again. > > I think it would make sense to at least move this forward be only > exclusively checking the schema for user keys in MapState, and allow value > schema evolution. >

Re: Zeppelin

2019-04-25 Thread Dawid Wysakowicz
Hi Sergey, I am not very familiar with Zepellin. But I know Jeff (cc'ed) is working on integrating Flink with some notebooks. He might be able to help you. Best, Dawid On 25/04/2019 08:42, Smirnov Sergey Vladimirovich (39833) wrote: > > Hello, > >   > > Trying to link Zeppelin 0.9 with Flink

Re: TM occasionally hang in deploying state in Flink 1.5

2019-04-25 Thread Dawid Wysakowicz
Hi, Feel free to open a JIRA for this issue. By the way have you investigated what is the root cause for it hanging? Best, Dawid On 25/04/2019 08:55, qi luo wrote: > Hello, > > This issue occurred again and we dumped the TM thread. It indeed hung > on socket read to download jar from Blob

Re: [Discuss] Add JobListener (hook) in flink job lifecycle

2019-04-25 Thread Becket Qin
Thanks for the proposal, Jeff. Adding a listener to allow users handle events during the job lifecycle makes a lot of sense to me. Here are my two cents. * How do user specify the listener? * It is not quite clear to me whether we consider ClusterClient as a public interface? From what I

Re: Flink Control Stream

2019-04-25 Thread Dominik Wosiński
Thanks for help Till, I thought so, but I wanted to be sure. Best Regards, Dom.

Re: Job Startup Arguments

2019-04-25 Thread Dawid Wysakowicz
Hi Steve, As far as I know, this information is not available in REST API, but it would be good to double check with Chesnay(cc'ed). You can see the complete list of available REST commands here[1]. Best, Dawid [1]

Re: kafka corrupt record exception

2019-04-25 Thread Dominik Wosiński
Hey, Sorry for such a delay, but I have missed this message. Basically, technically you could have Kafka broker installed in version say 1.0.0 and using FlinkKafkaConsumer08. This could technically create issues. I'm not sure if You can automate the process of skipping corrupted messages, as You

Re: Zeppelin

2019-04-25 Thread Jeff Zhang
Thanks Dawid, Hi Sergey, I am working on update the flink interpreter of zeppelin to support flink 1.9 (supposed to be released this summer). For the current flink interpreter of zeppelin 0.9, I haven't verified it against flink 1.8. could you show the full interpreter log ? And what is the size

Re: Taskmanager unable to rejoin job manager

2019-04-25 Thread Mar_zieh
Hello I want to run flink on apache Mesos with Marathon and I configure Zookeeper too; so I run "mesos-appmaster.sh"; but it shows me this error: 2019-04-25 13:53:18,160 INFO org.apache.flink.mesos.runtime.clusterframework.MesosResourceManager - Mesos resource manager started. 2019-04-25

Re: QueryableState startup regression in 1.8.0 ( migration from 1.7.2 )

2019-04-25 Thread Vishal Santoshi
Ditto that, queryable-state.enable to true works. Thanks everyone. On Thu, Apr 25, 2019 at 6:28 AM Dawid Wysakowicz wrote: > Hi Vishal, > > As Guowei mentioned you have to enable the Queryable state. The default > setting was changed in 1.8.0. There is an open JIRA[1] for changing the >

Re: [Discuss] Add JobListener (hook) in flink job lifecycle

2019-04-25 Thread Jeff Zhang
Hi Beckett, Thanks for your feedback, See my comments inline >>> How do user specify the listener? * What I proposal is to register JobListener in ExecutionEnvironment. I don't think we should make ClusterClient as public api. >>> Where should the listener run? * I don't think it is proper to

Re: Flink CLI

2019-04-25 Thread Oytun Tez
I had come across flink-deployer actually, but somehow didn't want to "learn" it... (versus just a bunch of lines in a script) At some time with more bandwidth, we should migrate to this one and standardize flink-deployer (and later take this to mainstream Flink :P). --- Oytun Tez *M O T A W O

read only mode for Flink UI

2019-04-25 Thread uday bhaskar
Hi We are looking at running Flink on Kubernetes in Job cluster mode. As part of our plans we do not want to allow modifications to the job cluster once a job is running. For this we are looking at a "read-only" Flink UI, that does not allow users to cancel a job or submit a job. My question is,

Re: No zero ( 2 ) exit code on k8s StandaloneJobClusterEntryPoint when save point with cancel...

2019-04-25 Thread Vishal Santoshi
Here you go https://issues.apache.org/jira/browse/FLINK-12333 Again thanks for the prompt response On Wed, Apr 24, 2019 at 11:06 AM Till Rohrmann wrote: > Good to hear. Could you create a documentation JIRA issue for this > problem? Thanks a lot. > > Cheers, > Till > > On Wed, Apr 24, 2019

Re: assignTimestampsAndWatermarks not work after KeyedStream.process

2019-04-25 Thread an0
If my understanding is correct, then why `assignTimestampsAndWatermarks` before `keyBy` works? The `timeWindowAll` stream's input streams are task 1 and task 2, with task 2 idling, no matter whether `assignTimestampsAndWatermarks` is before or after `keyBy`, because whether task 2 receives

RE: EXT :read only mode for Flink UI

2019-04-25 Thread Martin, Nick
AFAIK, there are no granular permissions like that built into Flink. Limiting access to the REST API seems like a good place to start. The web UI uses the API, but controlling it there means you’re locking down all means of access. The designers of the API were disciplined about what HTTP verbs

Re: Exceptions when launching counts on a Flink DataSet concurrently

2019-04-25 Thread Juan Rodríguez Hortalá
Any thoughts on this? On Sun, Apr 7, 2019, 6:56 PM Juan Rodríguez Hortalá < juan.rodriguez.hort...@gmail.com> wrote: > Hi, > > I have a very simple program using the local execution environment, that > throws NPE and other exceptions related to concurrent access when launching > a count for a

Emitting current state to a sink

2019-04-25 Thread Avi Levi
Hi, We have a keyed pipeline with persisted state. Is there a way to broadcast a command and collect all values that persisted in the state ? The end result can be for example sending a fetch command to all operators and emitting the results to some sink why do we need it ? from time to time we

Re: How to implement custom stream operator over a window? And after the Count-Min Sketch?

2019-04-25 Thread Rong Rong
Hi Felipe, I am not sure the algorithm requires to construct a new extension of the window operator. I think your implementation of the CountMinSketch object as an aggregator: E.g. 1. AggregateState (ACC) should be the aggregating accumulate count-min-sketch 2-D hash array (plus a few other

Re: taskmanager faild

2019-04-25 Thread Xintong Song
hi naisili, 我没有在你的邮件里看到任何附件、截图或者文字描述的错误,麻烦你再确认一次。 Thank you~ Xintong Song On Fri, Apr 26, 2019 at 10:46 AM naisili Yuan wrote: > 还是集群稳定性问题,发现了这个错误,我想问下是不是我配置集群高可用的问题,是否不依赖zookeeper会更稳定一点。 > 希望得到回复,谢谢! > > naisili Yuan 于2019年4月22日周一 下午2:23写道: > >> 不好意思,我忘记贴图了。 >> 我的flink