Limit in batch flink sql job

2019-02-11 Thread yinhua.dai
Why does Flink report "Limiting the result without sorting is not allowed as it could lead to arbitrary results" when I use LIMIT in batch mode? SELECT * FROM csvSource limit 10; -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Frequent Heartbeat timeout

2019-02-11 Thread sohimankotia
Hi, I am using Flink 1.5.5. I have a streaming job with a parallelism of 25 * 6 (150). I am facing very frequent heartbeat timeouts. They occur even during off-peak hours, which should rule out memory issues. I also enabled debug logs for Flink and observed that a heartbeat request is triggered every 5 seconds.

Re: Dynamic Rules Creation For Flink CEP

2019-02-11 Thread dhanuka ranasinghe
I have written a blog post. Although it does not use CEP, I hope you can apply the same approach to CEP as well. http://dhanuka84.blogspot.com/2019/02/rule-execution-as-streaming-process.html?m=1 On Tue, 12 Feb 2019, 11:46 Dian Fu wrote: > This feature has not been supported yet. There is a JIRA( >

Re: Dynamic Rules Creation For Flink CEP

2019-02-11 Thread Dian Fu
This feature is not supported yet. There is a JIRA (https://issues.apache.org/jira/browse/FLINK-7129) for this feature. > On 12 Feb 2019, at 9:32 AM, Titus Rakkesh wrote: > > Dears, > We are having the use case of creating and uploading new patterns > dynamically and change the app behavior

Dynamic Rules Creation For Flink CEP

2019-02-11 Thread Titus Rakkesh
Dears, We have a use case of creating and uploading new patterns dynamically and changing the app behavior at runtime. Does the current Flink version support that? Thanks

Re: Is there a windowing strategy that allows a different offset per key?

2019-02-11 Thread Rong Rong
Hi Stephen, Yes, we had a discussion regarding dynamic offsets and keys [1]. The main idea was the same: we don't have many complex operators after the window operator, thus a huge spike of traffic will occur after firing on the window boundary. In the discussion the best idea

Re: Can an Aggregate the key from a WindowedStream.aggregate()

2019-02-11 Thread Rong Rong
Hi Stephen, Chesnay was right, you will have to use a more complex version of the window processing function. Perhaps your goal can be achieved by this specific function with incremental aggregation [1]. If not, you can always use the regular process window function [2]. Both of these methods have

Flink Job sometimes does not stop as expected on cancelling.

2019-02-11 Thread Mahantesh Patil
Hi Team, We have Flink jobs running in cluster mode. When I cancel a job and check its status, it still shows as running. Below are the logs generated. I do not see any useful information. Could you point me in the right direction to debug this? Our Flink jobs do heavy processing, my

Re: How to load multiple same-format files with single batch job?

2019-02-11 Thread françois lacombe
Hi Fabian, I've got issues with a custom InputFormat implementation in my existing code. Can this be used in combination with a custom BatchTableSource? As I understand your solution, I should move my source to implementations like: tableEnvironment .connect(...)

Re: stream of large objects

2019-02-11 Thread Aggarwal, Ajay
I looked a little into broadcast state, and while it's interesting I don't think it will help me. Since broadcast state is kept entirely in memory, I am worried about the memory requirements if I make all these LargeMessages part of broadcast state. Furthermore these LargeMessages need to be processed in a

Re: Anyone tried to do blue-green topology deployments?

2019-02-11 Thread Stephen Connolly
On Mon, 11 Feb 2019 at 14:10, Stephen Connolly < stephen.alan.conno...@gmail.com> wrote: > Another possibility would be injecting pseudo events into the source and > having a stateful filter. > > The event would be something like “key X is now owned by green”. > > I can do that because getting a

Re: Flink Standalone cluster - logging problem

2019-02-11 Thread Gary Yao
Hi, Are you logging from your own operator implementations, and you expect these log messages to end up in a file prefixed with XYZ-? If that is the case, modifying log4j-cli.properties will not be expedient as I wrote earlier. You should modify the log4j.properties on all hosts that are running

Re: Anyone tried to do blue-green topology deployments?

2019-02-11 Thread Stephen Connolly
Another possibility would be injecting pseudo events into the source and having a stateful filter. The event would be something like “key X is now owned by green”. I can do that because getting a list of keys seen in the past X minutes is cheap (we have it already). But it’s unclear what impact
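
The stateful filter described in this message could look roughly like the sketch below. It is plain Java rather than Flink API, and every name in it (OwnershipFilter, Color, onOwnershipChange, accept) is hypothetical. It also assumes that keys with no ownership event yet belong to blue, which the message does not state. In a real job the map would presumably live in Flink keyed state.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a stateful ownership filter for blue-green topologies. */
final class OwnershipFilter {
    enum Color { BLUE, GREEN }

    private final Color self;                                  // which deployment this filter runs in
    private final Map<String, Color> owners = new HashMap<>(); // last known owner per key

    OwnershipFilter(Color self) {
        this.self = self;
    }

    /** Handles a pseudo event such as "key X is now owned by green". */
    void onOwnershipChange(String key, Color newOwner) {
        owners.put(key, newOwner);
    }

    /** Lets a data event through only if this deployment currently owns its key. */
    boolean accept(String key) {
        // Assumption: keys without an ownership event still belong to BLUE.
        return owners.getOrDefault(key, Color.BLUE) == self;
    }

    public static void main(String[] args) {
        OwnershipFilter green = new OwnershipFilter(Color.GREEN);
        System.out.println(green.accept("X")); // false, blue still owns X
        green.onOwnershipChange("X", Color.GREEN);
        System.out.println(green.accept("X")); // true, X was handed over
    }
}
```

Both topologies would run the same filter with a different `self` value, so each key is processed by exactly one of them at any time.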

Re: Anyone tried to do blue-green topology deployments?

2019-02-11 Thread Stephen Connolly
On Mon, 11 Feb 2019 at 13:26, Stephen Connolly < stephen.alan.conno...@gmail.com> wrote: > I have my main application updating with a blue-green deployment strategy > whereby a new version (always called green) starts receiving an initial > fraction of the web traffic and then - based on the

Anyone tried to do blue-green topology deployments?

2019-02-11 Thread Stephen Connolly
I have my main application updating with a blue-green deployment strategy whereby a new version (always called green) starts receiving an initial fraction of the web traffic and then - based on the error rates - we progress the % of traffic until 100% of traffic is being handled by the green

Re: No resource available error while testing HA

2019-02-11 Thread Gary Yao
Hi Averell, Logback has this feature [1] but it is not enabled out of the box. You will have to enable the JMX agent by setting the com.sun.management.jmxremote system property [2][3]. I have not tried this out, though. Best, Gary [1] https://logback.qos.ch/manual/jmxConfig.html [2]

Re: Flink Standalone cluster - logging problem

2019-02-11 Thread simpleusr
Hi Selvaraj, This did not help either. Thanks

Re: Flink Standalone cluster - logging problem

2019-02-11 Thread Selvaraj chennappan
Could you please try modifying conf/logback.xml? Regards, Selvaraj C On Mon, Feb 11, 2019 at 4:32 PM simpleusr wrote: > Hi Gary, > > By "job logs" I mean all the loggers under a subpackage of > com.mycompany.xyz > . > > We are using ./bin/flink run command for job execution that's why I modified

Re: Flink Standalone cluster - logging problem

2019-02-11 Thread simpleusr
Hi Gary, By "job logs" I mean all the loggers under a subpackage of com.mycompany.xyz. We are using the ./bin/flink run command for job execution, which is why I modified log4j-cli.properties. Modifying log4j.properties also did not help... Regards

Re: Flink Standalone cluster - logging problem

2019-02-11 Thread Gary Yao
Hi, Can you define what you mean by "job logs"? For code that is run on the cluster, i.e., JM or TM, you should add your config to log4j.properties. The log4j-cli.properties file is only used by the Flink CLI process. Best, Gary On Mon, Feb 11, 2019 at 7:39 AM simpleusr wrote: > Hi Chesnay, >

Re: Reduce one event under multiple keys

2019-02-11 Thread Stephen Connolly
On Mon, 11 Feb 2019 at 09:42, Fabian Hueske wrote: > Hi Stephen, > > A window is created with the first record that is assigned to it. > If the windows are based on time and a key, then no window will be created > (and no space will be occupied) if there is not a first record for a key and > time

Re: Is there a windowing strategy that allows a different offset per key?

2019-02-11 Thread Stephen Connolly
On Mon, 11 Feb 2019 at 09:54, Fabian Hueske wrote: > Hi Stephen, > > First of all, yes, windows computing and emitting at the same time can > cause pressure on the downstream system. > > There are a few ways you can achieve this: > * use a custom window assigner. A window assigner decides

Re: flink 1.7.1 and RollingFileSink

2019-02-11 Thread Fabian Hueske
Hi Vishal, Kostas (in CC) should be able to help here. Best, Fabian On Mon, 11 Feb 2019 at 00:05, Vishal Santoshi < vishal.santo...@gmail.com> wrote: > Anyone? > > On Sun, Feb 10, 2019 at 2:07 PM Vishal Santoshi > wrote: > >> You don't have to. Thank you for the input. >> >> On Sun,

Re: Is there a windowing strategy that allows a different offset per key?

2019-02-11 Thread Fabian Hueske
Hi Stephen, First of all, yes, windows computing and emitting at the same time can cause pressure on the downstream system. There are a few ways you can achieve this: * use a custom window assigner. A window assigner decides which window a record is assigned to. This is the approach you
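
The arithmetic a custom window assigner with a per-key offset would need can be sketched as follows. This is plain Java, not the Flink WindowAssigner API. The class and method names are hypothetical, and deriving the offset from the key's hash code is an assumption made for illustration, not something stated in this thread.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: tumbling-window boundaries shifted by a per-key offset. */
final class PerKeyOffsetWindows {
    private PerKeyOffsetWindows() {}

    /** Derives a stable offset in [0, windowSizeMs) from the key's hash code (assumption). */
    static long offsetForKey(Object key, long windowSizeMs) {
        return Math.floorMod((long) key.hashCode(), windowSizeMs);
    }

    /** Start of the tumbling window that contains timestampMs, shifted by offsetMs. */
    static long windowStart(long timestampMs, long offsetMs, long windowSizeMs) {
        return timestampMs - Math.floorMod(timestampMs - offsetMs, windowSizeMs);
    }

    public static void main(String[] args) {
        long size = TimeUnit.MINUTES.toMillis(1);
        long timestamp = 1_549_879_200_000L;
        for (String key : new String[] {"user-a", "user-b"}) {
            long offset = offsetForKey(key, size);
            long start = windowStart(timestamp, offset, size);
            System.out.println(key + " -> [" + start + ", " + (start + size) + ")");
        }
    }
}
```

Because different keys get different offsets, their windows end at different instants, which spreads the downstream load instead of firing everything on the same boundary.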

Re: Reduce one event under multiple keys

2019-02-11 Thread Fabian Hueske
Hi Stephen, A window is created with the first record that is assigned to it. If the windows are based on time and a key, then no window will be created (and no space will be occupied) if there is not a first record for a key and time interval. Anyway, if tracking the number of open files & average

Re: HA HDFS

2019-02-11 Thread Vishal Santoshi
One more thing I had to do. In an HA setup, the TMs are not able to resolve the job manager's random port through the jobmanager.rpc.port setting. Set up high-availability.jobmanager.port as a predefined

Re: Broadcast state before events stream consumption

2019-02-11 Thread Konstantin Knauf
Hi Chirag, Hi Vadim, off the top of my head, I see two options here: * Buffer the "fast" stream inside the KeyedBroadcastProcessFunction until relevant (whatever this means in your use case) broadcast events have arrived. Advantage: operationally easy, events are emitted as early as possible.
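
The buffering option can be illustrated with a minimal plain-Java sketch. It does not use the KeyedBroadcastProcessFunction API. All names are hypothetical, and it assumes that a single broadcast rule is enough to count as "relevant", which depends on the use case. In a real job the buffer and the rule would presumably be held in Flink state.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical sketch: hold back events until the first broadcast rule has arrived. */
final class BufferUntilRulesArrive {
    private final Queue<String> pending = new ArrayDeque<>(); // events seen before any rule
    private String rule;                                      // null until the first broadcast

    /** Called for each event of the "fast" stream; returns the results to emit now. */
    List<String> onEvent(String event) {
        List<String> out = new ArrayList<>();
        if (rule == null) {
            pending.add(event);    // no rule yet, so buffer instead of emitting
        } else {
            out.add(apply(event)); // rule known, emit immediately
        }
        return out;
    }

    /** Called for each broadcast event; stores the rule and flushes the buffer in arrival order. */
    List<String> onBroadcast(String newRule) {
        rule = newRule;
        List<String> out = new ArrayList<>();
        while (!pending.isEmpty()) {
            out.add(apply(pending.poll()));
        }
        return out;
    }

    private String apply(String event) {
        return rule + ":" + event; // placeholder for real rule evaluation
    }

    public static void main(String[] args) {
        BufferUntilRulesArrive fn = new BufferUntilRulesArrive();
        System.out.println(fn.onEvent("a"));      // []
        System.out.println(fn.onBroadcast("r1")); // [r1:a]
        System.out.println(fn.onEvent("b"));      // [r1:b]
    }
}
```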

Re: HA HDFS

2019-02-11 Thread Konstantin Knauf
Hi Steve, HDFS can be used as checkpoint storage and plays a crucial role in Flink's fault tolerance mechanism. HDFS alone will not suffice to get a Flink HA setup, though. You also need Zookeeper for JobManager HA. Flink configuration: high-availability: zookeeper

Re: Sliding window buffering on restart without save point

2019-02-11 Thread Konstantin Knauf
Hi William, first of all, I would like to give you two pointers regarding state migration: * If you set UUIDs for all operators you can change the topology of your job without breaking the compatibility with the savepoint [1]. State will be matched to the operator with the same UUID. * In Flink

Re: Running JobManager as Deployment instead of Job

2019-02-11 Thread Till Rohrmann
Hi Vishal, you can also keep the same cluster id when cancelling a job with savepoint and then resuming a new job from it. Terminating the job should clean up all state in Zk. Cheers, Till On Fri, Feb 8, 2019 at 11:26 PM Vishal Santoshi wrote: > In one case however, we do want to retain the

Re: long lived standalone job session cluster in kubernetes

2019-02-11 Thread Till Rohrmann
Hi Heath, I just learned that people from Alibaba already made some good progress with FLINK-9953. I'm currently talking to them in order to see how we can merge this contribution into Flink as fast as possible. Since I'm quite busy due to the upcoming release I hope that other community members