Re: [SURVEY] Custom RocksDB branch

2019-01-22 Thread aitozi
+1 from my side, since we rely on this feature to implement the real state TTL.

improper use of releaseJob() without a matching number of createTmpFiles() calls for jobId

2019-01-22 Thread Chang Liu
Dear community, I am having a problem releasing the job:

    2019-01-22 10:42:50.098 WARN [Source: Custom Source -> Kafka -> ConstructTxSepa -> FilterOutFailures -> ObtainActualTxSepa -> TxSepaStream -> TxStream -> IndicatorsEvalStream -> TxEvalStream -> Sink: Print to Std. Out (2/4)]

RE: Kafka stream fed in batches throughout the day

2019-01-22 Thread Jonny Graham
Thanks. The plan is to use the DataStream API (or possibly Beam), which is fine for data from Kafka (and presumably from a file too). Assuming that I don't stop and restart the streaming job but just leave it idle (as no events are coming into Kafka), is there any issue with leaving a window

Re: Temporal tables not behaving as expected

2019-01-22 Thread Chris Miller
Hi Fabian, I was investigating this further today and was just coming to the same conclusion! Thanks for the confirmation, I'll make the suggested changes and see where that gets me. Originally I had assumed processing time for the rates table would be set when the rates are first produced by

Re: [SURVEY] Custom RocksDB branch

2019-01-22 Thread Stefan Richter
+1 from me as well. > On 22. Jan 2019, at 10:47, aitozi wrote: > > +1 from my side, since we rely on this feature to implement the real state > ttl . > > > > -- > Sent from: > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Temporal tables not behaving as expected

2019-01-22 Thread Fabian Hueske
Hi, The problem is that you are using processing time which is non-deterministic. Both inputs are consumed at the same time and joined based on which record arrived first. The result depends on a race condition. If you change the input table to have event time attributes and use these to
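
For reference, a minimal sketch of the event-time setup in the Scala Table API (Flink 1.7; tEnv, ratesStream and all field names are assumptions, and the stream needs timestamps/watermarks assigned):

    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.table.api.scala._

    // Expose the stream as a table with an event-time attribute.
    val rates = tEnv.fromDataStream(
      ratesStream, 'r_currency, 'r_rate, 'r_rowtime.rowtime)

    // Temporal table function versioned by r_rowtime, keyed on r_currency;
    // a join via Rates(o_rowtime) then picks the version valid at that time.
    tEnv.registerFunction(
      "Rates", rates.createTemporalTableFunction('r_rowtime, 'r_currency))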

Re: Kafka stream fed in batches throughout the day

2019-01-22 Thread Fabian Hueske
Hi Jonny, I think this is a good use case for event-time stream processing. The idea of taking a savepoint, stopping, and later resuming the job is good, as it frees the resources that would otherwise be occupied by the idle job. In that sense it would behave like a batch job. However, in contrast
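
The savepoint-and-resume cycle described above looks roughly like this on the CLI (job id, savepoint paths and jar name are placeholders):

    # take a savepoint and cancel the job once the daily batch is processed
    flink cancel -s hdfs:///flink/savepoints <jobId>

    # later, resume from that savepoint when the next batch lands in Kafka
    flink run -s hdfs:///flink/savepoints/savepoint-abc123 my-job.jar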

Re: improper use of releaseJob() without a matching number of createTmpFiles() calls for jobId

2019-01-22 Thread Chang Liu
Ok, I think I found where the issue is, but I don't understand why. I have a method:

    def evaluationStream(
        indicatorsStream: DataStream[Indicators],
        scenarios: Set[Scenario]): DataStream[Evaluation] =
      indicatorsStream.map { indicators =>
        Evaluation(indicators.id

Re: improper use of releaseJob() without a matching number of createTmpFiles() calls for jobId

2019-01-22 Thread Chang Liu
I have tried another way, and it is not working either:

    def evaluationStream(
        indicatorsStream: DataStream[Indicators],
        scenarios: Set[Scenario]): DataStream[Evaluation] =
      indicatorsStream.map(new IndicatorsToTxEval(scenarios))

    class IndicatorsToTxEval(
        scenarios: Set[Scenario])

Re: Query on retract stream

2019-01-22 Thread Gagan Agrawal
Thanks Hequn for your response. I initially thought of trying the "over window" clause; however, as per the documentation there seems to be a limitation in the "orderBy" clause, which allows only a single event-time/processing-time attribute. Whereas in my case events are generated from the MySQL binlog
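
For reference, the OVER window in question orders by exactly one time attribute; a sketch against a hypothetical transactions table (tEnv and all names are made up):

    // Flink SQL accepts only a single event-time or processing-time
    // attribute in the ORDER BY clause of an OVER window.
    val result = tEnv.sqlQuery(
      """
        |SELECT account_id,
        |       SUM(amount) OVER (
        |         PARTITION BY account_id
        |         ORDER BY proctime
        |         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS total
        |FROM transactions
      """.stripMargin)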

Re: JSON to CEP conversion

2019-01-22 Thread ashish pok
Awesome. Let me look into it. Thanks a lot! - Ashish On Tuesday, January 22, 2019, 3:31 PM, Dominik Wosiński wrote: Hey Anish, I have done some abstraction over the logic of CEP, but with the use of Apache Bahir[1], which introduces the Siddhi CEP[2] engine that allows SQL-like definitions

Re: JSON to CEP conversion

2019-01-22 Thread Dominik Wosiński
Hey Anish, I have done some abstraction over the logic of CEP, but with the use of Apache Bahir[1], which introduces the Siddhi CEP[2] engine that allows SQL-like definitions of the logic. Best, Dom. [1] https://github.com/apache/bahir [2] https://github.com/wso2/siddhi Tue, 22 Jan 2019 at 20:20

Re: Flink Jdbc streaming source support in 1.7.1 or in future?

2019-01-22 Thread Zhenghua Gao
Actually, the flink-connectors/flink-jdbc module provides a JDBCInputFormat to read data from a database. You can give it a try.
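
A minimal sketch of using it (Flink 1.7 with the flink-jdbc dependency; driver, URL, query and field types are placeholders):

    import org.apache.flink.api.common.typeinfo.BasicTypeInfo
    import org.apache.flink.api.java.io.jdbc.JDBCInputFormat
    import org.apache.flink.api.java.typeutils.RowTypeInfo

    val inputFormat = JDBCInputFormat.buildJDBCInputFormat()
      .setDrivername("com.mysql.jdbc.Driver")
      .setDBUrl("jdbc:mysql://localhost:3306/mydb")
      .setUsername("user")
      .setPassword("secret")
      .setQuery("SELECT id, amount FROM orders")
      .setRowTypeInfo(new RowTypeInfo(
        BasicTypeInfo.LONG_TYPE_INFO, BasicTypeInfo.DOUBLE_TYPE_INFO))
      .finish()

    // env can be a batch ExecutionEnvironment or a StreamExecutionEnvironment;
    // either way the source finishes once the query result is exhausted.
    val rows = env.createInput(inputFormat)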

Is there a way to get all flink build-in SQL functions

2019-01-22 Thread yinhua.dai
I would like to put this list into our self-service Flink SQL web UI. Thanks.

Re: improper use of releaseJob() without a matching number of createTmpFiles() calls for jobId

2019-01-22 Thread Stefan Richter
Hi, Which version of Flink are you using? This issue https://issues.apache.org/jira/browse/FLINK-10283 shows that a similar problem was fixed in 1.6.1 and 1.7. If you use a newer version and still encounter the problem, you can reopen the

No resource available error while testing HA

2019-01-22 Thread Averell
Hello everyone, I am testing High Availability of Flink on YARN on an AWS EMR cluster. My configuration is an EMR with one master-node and 3 core-nodes (each with 16 vCores). Zookeeper is running on all nodes. Yarn session was created with: flink-yarn-session -n 2 -s 8 -jm 1024m -tm 20g A job
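
For comparison, HA on YARN usually needs these flink-conf.yaml entries (quorum and paths are placeholders):

    high-availability: zookeeper
    high-availability.zookeeper.quorum: node1:2181,node2:2181,node3:2181
    high-availability.storageDir: hdfs:///flink/ha/
    # allow YARN to restart the ApplicationMaster on failure
    yarn.application-attempts: 4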

Re: Flink Jdbc streaming source support in 1.7.1 or in future?

2019-01-22 Thread Manjusha Vuyyuru
But 'JDBCInputFormat' will exit once it's done reading all the data. I need something that keeps polling MySQL and fetches any updates or changes. Thanks, manju On Wed, Jan 23, 2019 at 7:10 AM Zhenghua Gao wrote: > Actually flink-connectors/flink-jdbc module provided a
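
Flink 1.7 has no official polling JDBC source; a hand-rolled sketch of the idea (class name and query are hypothetical, and a real version would track a modified-timestamp column instead of re-reading everything):

    import java.sql.DriverManager
    import org.apache.flink.streaming.api.functions.source.SourceFunction

    class PollingJdbcSource(url: String, query: String, intervalMs: Long)
        extends SourceFunction[String] {

      @volatile private var running = true

      override def run(ctx: SourceFunction.SourceContext[String]): Unit = {
        val conn = DriverManager.getConnection(url)
        try {
          while (running) {
            // re-run the query on every poll and emit each row's first column
            val rs = conn.createStatement().executeQuery(query)
            while (rs.next()) ctx.collect(rs.getString(1))
            Thread.sleep(intervalMs)
          }
        } finally conn.close()
      }

      override def cancel(): Unit = running = false
    }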

Re: improper use of releaseJob() without a matching number of createTmpFiles() calls for jobId

2019-01-22 Thread Chang Liu
Hi Stefan, Thanks. I am using 1.6.0. I will upgrade to 1.6.1 and see whether the problem remains. Best regards/祝好, Chang Liu 刘畅 > On 22 Jan 2019, at 16:10, Stefan Richter wrote: > > Hi, > > Which version of Flink are you using? This issue >

Re: improper use of releaseJob() without a matching number of createTmpFiles() calls for jobId

2019-01-22 Thread Chang Liu
Hi Stefan, I have upgraded to 1.6.1. The warnings are gone, but my issue remains: the scenarios: Set[Scenario] cannot be passed as a method argument in order to be used in the map function. But it works if I directly use the object variable scenarios: Set[Scenario] instead of
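
Without the full stack trace this is a guess, but the usual suspects when a captured argument breaks serialization are (a) the closure dragging in the enclosing non-serializable class and (b) the element type itself (here Scenario) not being Serializable. A defensive sketch (the Evaluation signature is hypothetical):

    // Copy the argument into a local val so the lambda closes over that value
    // only, not over `this`; Scenario must still be Serializable, because
    // Flink ships the whole function object to the task managers.
    def evaluationStream(
        indicatorsStream: DataStream[Indicators],
        scenarios: Set[Scenario]): DataStream[Evaluation] = {
      val localScenarios = scenarios
      indicatorsStream.map(indicators =>
        Evaluation(indicators.id, localScenarios))
    }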

JSON to CEP conversion

2019-01-22 Thread ashish pok
All, Wondering if anyone in the community has started something along these lines - the idea being that CEP logic is abstracted out to metadata instead. That can then be exposed to users from a REST API/UI etc. Obviously, it would really need some additional information like a data catalog etc. for it

Re: `env.java.opts` not persisting after job canceled or failed and then restarted

2019-01-22 Thread Aaron Levin
Hey, Thanks so much for the help! This is awesome. I'll start looking into all of this right away and report back. Best, Aaron Levin On Mon, Jan 21, 2019 at 5:16 PM Ufuk Celebi wrote: > Hey Aaron, > > sorry for the late reply. > > (1) I think I was able to reproduce this issue using

Flink CEP : Doesn't generate output

2019-01-22 Thread dhanuka ranasinghe
Hi All, I have used Flink CEP to filter some events and generate alerts based on certain conditions. But unfortunately it doesn't print any result. I have attached the source code herewith; could you please help me with this? package
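
The attached code isn't in the archive, but the most frequent cause of a silent CEP job is running in event time without assigning watermarks. A processing-time smoke test like this (made-up events) helps narrow that down:

    import org.apache.flink.cep.scala.CEP
    import org.apache.flink.cep.scala.pattern.Pattern
    import org.apache.flink.streaming.api.scala._

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val events: DataStream[String] = env.fromElements("start", "end")

    val pattern = Pattern.begin[String]("first").where(_ == "start")
      .next("second").where(_ == "end")

    // In the Scala API, select receives a Map from pattern name to
    // the matched events.
    CEP.pattern(events, pattern)
      .select(m => m("first").head + " -> " + m("second").head)
      .print()

    env.execute("cep-smoke-test")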

GlobalWindow with custom tigger doesn't work correctly

2019-01-22 Thread Daniel Krenn
Hello people! I have a DataStream which has events with a continuing number which signifies their belonging to a production cycle. In essence, this is what the data looks like:

    value, production cycle
    12.0, 2000
    12.3, 2000    <- one production cycle
    12.2, 2000
    0.0, 2001
    0.4, 2002     <- another
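
One way to attack this is a GlobalWindow trigger that fires and purges whenever the cycle number changes. A sketch, assuming (value, cycle) tuples on a keyed stream; note that onElement runs after the element was added, so the first event of a new cycle lands in the old pane unless you evict it:

    import org.apache.flink.api.common.state.ValueStateDescriptor
    import org.apache.flink.streaming.api.windowing.triggers.{Trigger, TriggerResult}
    import org.apache.flink.streaming.api.windowing.windows.GlobalWindow

    class CycleChangeTrigger extends Trigger[(Double, Long), GlobalWindow] {

      // per-key state remembering the last seen production cycle
      private val lastCycle =
        new ValueStateDescriptor[java.lang.Long]("lastCycle", classOf[java.lang.Long])

      override def onElement(element: (Double, Long), timestamp: Long,
          window: GlobalWindow, ctx: Trigger.TriggerContext): TriggerResult = {
        val state = ctx.getPartitionedState(lastCycle)
        val previous = state.value()
        state.update(element._2)
        if (previous != null && previous.longValue() != element._2)
          TriggerResult.FIRE_AND_PURGE // cycle number changed: emit the window
        else
          TriggerResult.CONTINUE
      }

      override def onProcessingTime(time: Long, window: GlobalWindow,
          ctx: Trigger.TriggerContext): TriggerResult = TriggerResult.CONTINUE

      override def onEventTime(time: Long, window: GlobalWindow,
          ctx: Trigger.TriggerContext): TriggerResult = TriggerResult.CONTINUE

      override def clear(window: GlobalWindow, ctx: Trigger.TriggerContext): Unit =
        ctx.getPartitionedState(lastCycle).clear()
    }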

Re: `env.java.opts` not persisting after job canceled or failed and then restarted

2019-01-22 Thread Aaron Levin
Hey Ufuk, So, I looked into this a little bit: 1. clarification: my issues are with the hadoop-related snappy libraries and not libsnappy itself (this is my bad for not being clearer, sorry!). I already have `libsnappy` on my classpath, but I am looking into including the hadoop snappy
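
If the missing piece really is the hadoop native snappy, the flink-conf.yaml knob from this thread's subject is the place to point the JVM at it (the path is a placeholder):

    # applied to both the JobManager and TaskManager JVMs
    env.java.opts: -Djava.library.path=/usr/lib/hadoop/lib/native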

Change Flink checkpoint configuration at runtime

2019-01-22 Thread knur
I'm running a streaming job that uses the following config:

    checkpointInterval = 5 mins
    minPauseBetweenCheckpoints = 2 mins
    checkpointTimeout = 1 minute
    maxConcurrentCheckpoints = 1

This is using incremental, async checkpoints with the RocksDB backend. So far around 2K
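
Those values map onto the programmatic settings like this (a sketch; in Flink 1.7 they cannot be changed at runtime, so applying new values means restarting the job, e.g. from a savepoint):

    import org.apache.flink.streaming.api.CheckpointingMode
    import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.enableCheckpointing(5 * 60 * 1000, CheckpointingMode.EXACTLY_ONCE)

    val cfg = env.getCheckpointConfig
    cfg.setMinPauseBetweenCheckpoints(2 * 60 * 1000) // 2 min
    cfg.setCheckpointTimeout(60 * 1000)              // 1 min
    cfg.setMaxConcurrentCheckpoints(1)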