Re: Slide Window Compute Optimization

2018-07-05 Thread YennieChen88
Hi Kostas and Rong, Thank you for your replies. Since both of you asked for more info about my use case, I will answer you together. My use case is counting the number of successful and failed logins within one hour, keyed by other login-related attributes (e.g. ip, device, login type ...).

Re: Handling back pressure in Flink.

2018-07-05 Thread Zhijiang(wangzhijiang999)
Hi Mich, From Flink 1.5.0 on, network flow control is improved by a credit-based mechanism which handles backpressure better than before. The producer sends data based on the number of available buffers (credit) on the consumer side. If processing time on consumer side is slower than producing time

Re: Limiting in flight data

2018-07-05 Thread Zhijiang(wangzhijiang999)
Hi Vishal, Before Flink 1.5.0, the sender tries its best to send data on the network until the wire is filled with data. From Flink 1.5.0 on, network flow control is improved by a credit-based approach. That means the sender transfers data based on how many buffers are available on the receiver side, so there
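
The credit idea described in the two replies above can be illustrated with a toy model. This is only a sketch under assumptions: the class and method names (CreditFlowSketch, Receiver, sendWithCredit) are hypothetical and this is not Flink's network stack. It just shows a sender that never transmits more records than the receiver has free buffers.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Toy model of credit-based flow control. NOT Flink's implementation;
// all names here are hypothetical and chosen for illustration only.
public class CreditFlowSketch {

    // Receiver side: owns a fixed number of buffers; each free buffer is one credit.
    static class Receiver {
        private final Queue<String> buffers = new ArrayDeque<>();
        private final int capacity;

        Receiver(int capacity) {
            this.capacity = capacity;
        }

        int availableCredit() {
            return capacity - buffers.size();
        }

        void accept(String record) {
            if (buffers.size() >= capacity) {
                throw new IllegalStateException("no free buffer");
            }
            buffers.add(record);
        }

        // Processing one record frees a buffer, which hands one credit back.
        String processOne() {
            return buffers.poll();
        }
    }

    // Sender side: sends pending records only while credit is available.
    // Returns the number of records actually sent.
    static int sendWithCredit(Queue<String> pending, Receiver receiver) {
        int sent = 0;
        while (!pending.isEmpty() && receiver.availableCredit() > 0) {
            receiver.accept(pending.poll());
            sent++;
        }
        return sent;
    }

    public static void main(String[] args) {
        Queue<String> pending = new ArrayDeque<>();
        for (int i = 0; i < 5; i++) {
            pending.add("record-" + i);
        }
        Receiver receiver = new Receiver(2);
        System.out.println(sendWithCredit(pending, receiver)); // prints 2
        receiver.processOne();                                 // frees one buffer
        System.out.println(sendWithCredit(pending, receiver)); // prints 1
    }
}
```

In this model a slow receiver simply stops granting credit, so unsent records stay queued at the sender instead of filling the wire.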

Re: Description of Flink event time processing

2018-07-05 Thread Rong Rong
Hi Elias, Thanks for putting together the document. This is actually a very good, well-rounded document. I think you did not enable comment access for the link. Would you mind enabling comments for the Google doc? Thanks, Rong On Thu, Jul 5, 2018 at 8:39 AM Fabian Hueske wrote: > Hi

Re: Slide Window Compute Optimization

2018-07-05 Thread Rong Rong
Hi Yennie, AFAIK, the sliding window will in fact duplicate elements into multiple different streams. There's a discussion thread regarding this [1]. We are looking into some performance improvements. Can you provide some more info regarding your use case? -- Rong [1]

Re: Passing type information to JDBCAppendTableSink

2018-07-05 Thread Rong Rong
+1 to this answer. MERGE is the most compatible syntax I have found when dealing with upsert / replace. AFAIK, almost all DBMSs have some kind of dialect regarding upsert functionality, so following the SQL standard might be your best solution here. And yes both the MERGE ingestion SQL and the

Re: Flink Kafka TimeoutException

2018-07-05 Thread ashish pok
Our experience has been that even when the Kafka cluster is healthy, JVM resource contention in our Flink app, caused by high heap utilization and the CPU cycles thereby lost to GC, also resulted in this issue. Getting basic JVM metrics like CPU load, GC times and heap utilization from your app (we use

Re: A use-case for Flink and reactive systems

2018-07-05 Thread Yersinia Ruckeri
I guess my poor initial explanation caused confusion. After reading the docs again I got your points. I can use Flink for online stream processing, letting it manage the state, which can be persisted in a DB asynchronously to ensure savepoints and using queryable state to make the current state

Re: Flink Kafka TimeoutException

2018-07-05 Thread Ted Yu
Have you tried increasing the request.timeout.ms parameter (Kafka)? Which Flink / Kafka release are you using? Cheers On Thu, Jul 5, 2018 at 5:39 AM Amol S - iProgrammer wrote: > Hello, > > I am using flink with kafka and getting below exception. > >
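
For reference, raising request.timeout.ms usually means setting it in the client properties handed to the Kafka producer or consumer. A minimal sketch, assuming a placeholder broker address and an arbitrary 60000 ms value (neither comes from this thread); the class and helper names are hypothetical:

```java
import java.util.Properties;

// Sketch only: builds Kafka client properties with a raised request timeout.
// The broker address and the timeout value used below are illustrative assumptions.
public class KafkaTimeoutConfigSketch {

    static Properties withRequestTimeout(String bootstrapServers, int timeoutMs) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // request.timeout.ms is the Kafka client setting mentioned in the reply above.
        props.setProperty("request.timeout.ms", Integer.toString(timeoutMs));
        return props;
    }

    public static void main(String[] args) {
        Properties props = withRequestTimeout("localhost:9092", 60000);
        System.out.println(props.getProperty("request.timeout.ms")); // prints 60000
    }
}
```

Whether a higher timeout actually helps depends on the cause of the exception, so it is worth checking broker health and client-side resource usage as well.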

Re: Description of Flink event time processing

2018-07-05 Thread Fabian Hueske
Hi Elias, Thanks for the great document! I made a pass over it and left a few comments. I think we should definitely add this to the documentation. Thanks, Fabian 2018-07-04 10:30 GMT+02:00 Fabian Hueske : > Hi Elias, > > I agree, the docs lack a coherent discussion of event time features. >

Re: 1.5.1

2018-07-05 Thread Chesnay Schepler
Release notes: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12343053 I'm currently building the release artifacts; if everything goes smoothly, it should be released next week. On 05.07.2018 16:16, Vishal Santoshi wrote: We are planning to go to 1.5.0 next week (

Limiting in flight data

2018-07-05 Thread Vishal Santoshi
"Yes, Flink 1.5.0 will come with better tools to handle this problem. Namely you will be able to limit the “in flight” data, by controlling the number of assigned credits per channel/input gate. Even without any configuring Flink 1.5.0 will out of the box buffer less data, thus mitigating the

1.5.1

2018-07-05 Thread Vishal Santoshi
We are planning to go to 1.5.0 next week ( from 1.4.0 ). I could not get a list of jira issues that would be addressed in the 1.5.1 release which seems imminent. Could anyone inform the forum of the 1.5.1 bug fixes and tell us a time line for the 1.5.1 release ? Thanks much

Re: Dynamic Rule Evaluation in Flink

2018-07-05 Thread Aarti Gupta
Thanks everyone, will take a look. --Aarti On Thu, Jul 5, 2018 at 6:37 PM, Fabian Hueske wrote: > Hi, > > > Flink doesn't support connecting multiple streams with heterogeneous > schema > > This is not correct. > Flink is very well able to connect streams with different schema. However, > you

Re: Dynamic Rule Evaluation in Flink

2018-07-05 Thread Aarti Gupta
+Ken. --Aarti On Thu, Jul 5, 2018 at 6:48 PM, Aarti Gupta wrote: > Thanks everyone, will take a look. > > --Aarti > > On Thu, Jul 5, 2018 at 6:37 PM, Fabian Hueske wrote: > >> Hi, >> >> > Flink doesn't support connecting multiple streams with heterogeneous >> schema >> >> This is not correct.

Re: A use-case for Flink and reactive systems

2018-07-05 Thread Mich Talebzadeh
Hi, What you are saying is I can either use Flink and forget database layer, or make a java microservice with a database. Mixing Flink with a Database doesn't make any sense. I would have thought that moving with microservices concept, Flink handling streaming data from the upstream microservice

Re: Dynamic Rule Evaluation in Flink

2018-07-05 Thread Fabian Hueske
Hi, > Flink doesn't support connecting multiple streams with heterogeneous schema This is not correct. Flink is very well able to connect streams with different schema. However, you cannot union two streams with different schema. In order to reconfigure an operator with changing rules, you can

Re: A use-case for Flink and reactive systems

2018-07-05 Thread Mich Talebzadeh
Hi Fabian, On your point below … Basically, you are moving the database into the streaming application. This assumes a finite size for the data that the streaming application persists. In terms of capacity planning, how does this work? Some applications like Fraud try to address this by deploying

Re: Release Notes - Flink 1.6

2018-07-05 Thread Chesnay Schepler
That's because 1.6 isn't released yet. It's scheduled to be released at the end of July. The latest stable version is 1.5.0. On 05.07.2018 14:49, Puneet Kinra wrote: Hi Not able to see 1.6 flink release notes. -- *Cheers * * * *Puneet Kinra* * * *Mobile:+918800167808 | Skype :

Release Notes - Flink 1.6

2018-07-05 Thread Puneet Kinra
Hi, I am not able to see the Flink 1.6 release notes. -- *Cheers * *Puneet Kinra* *Mobile:+918800167808 | Skype : puneet.ki...@customercentria.com * *e-mail :puneet.ki...@customercentria.com *

Batch Processing

2018-07-05 Thread Gaurav Sehgal
Hello, I am looking for a batch processing framework which will read data in batches from MongoDB, enrich it using another data source and then upload it to Elasticsearch. Is Flink a good framework for such a use case? Regards, Gaurav

Re: Dynamic Rule Evaluation in Flink

2018-07-05 Thread Puneet Kinra
Hi Aarti, Flink doesn't support connecting multiple streams with heterogeneous schema. You can try the solution below: a) If stream A is sending some events, make its output a String/JsonString. b) If stream B is sending some events, make its output a String/JsonString. c) Now

Re: Facing issue in RichSinkFunction

2018-07-05 Thread Hequn Cheng
Hi Amol, The implementation of the RichSinkFunction probably contains a field that is not serializable. To avoid the serialization exception, you can: 1. Mark the field as transient. This makes the serialization mechanism skip the field. 2. If the field is part of the object's persistent state, the
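
Option 1 can be demonstrated with plain Java serialization. This is a sketch under assumptions: Connection and MySink are hypothetical stand-ins (a real sink would extend Flink's RichSinkFunction and create the resource in its open method), and only the effect of the transient keyword is shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Sketch of the "mark the field as transient" option using plain Java serialization.
// Connection and MySink are hypothetical stand-ins, not Flink classes.
public class TransientFieldSketch {

    // A resource that is NOT Serializable, e.g. a client or connection object.
    static class Connection {
        final String url;

        Connection(String url) {
            this.url = url;
        }
    }

    static class MySink implements Serializable {
        private static final long serialVersionUID = 1L;

        final String url;                 // serializable configuration
        transient Connection connection;  // skipped by the serialization mechanism

        MySink(String url) {
            this.url = url;
        }

        // Plays the role of an open() hook: (re)create the resource on the worker.
        void open() {
            connection = new Connection(url);
        }
    }

    // Serializes and deserializes the sink, as shipping it to a worker would.
    static MySink roundTrip(MySink sink) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(sink);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (MySink) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    public static void main(String[] args) {
        MySink sink = new MySink("jdbc:example");
        sink.open();
        MySink copy = roundTrip(sink);               // works although Connection is not Serializable
        System.out.println(copy.connection == null); // prints true
        copy.open();
        System.out.println(copy.connection.url);     // prints jdbc:example
    }
}
```

Because the transient field comes back as null after deserialization, it has to be re-initialized before use, which is why the resource is created in an open-style method rather than in the constructor.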

Re: A use-case for Flink and reactive systems

2018-07-05 Thread Yersinia Ruckeri
It helps, at least it's fairly clear now. I am not against storing the state in Flink, but as per your first point, I need to get it persisted, asynchronously, in an external database too, to let other applications/services retrieve the state. What you are saying is I can either use

Re: [Table API/SQL] Finding missing spots to resolve why 'no watermark' is presented in Flink UI

2018-07-05 Thread Fabian Hueske
Hi, Thanks for the PR! I'll have a look at it later today. The problem of the retraction stream conversion is probably that the return type is a Tuple2[Boolean, Row]. The boolean flag indicates whether the row is added or retracted. Best, Fabian 2018-07-04 15:38 GMT+02:00 Jungtaek Lim : >

Re: Slide Window Compute Optimization

2018-07-05 Thread Kostas Kloudas
Hi, You are correct that with sliding windows you will have 3600 “open windows” at any point. Could you describe a bit more what you want to do? If you simply want to have an update of something like a counter every second, then you can implement your own logic with a ProcessFunction that
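
The message above is cut off, so the following is not necessarily what was proposed there. It is one possible sketch of custom counting logic, a swapped-in technique: keep one count per one-second bucket and sum the buckets of the last hour on demand, so each element touches a single bucket instead of 3600 windows. The class name is hypothetical, and timestamps are assumed to be non-negative and to arrive in non-decreasing order.

```java
import java.util.Arrays;

// Sketch of per-second bucket counting over a trailing window.
// Assumptions: non-negative timestamps arriving in non-decreasing order;
// hypothetical class, not a Flink API. In Flink this state would live
// inside a keyed function rather than a plain object.
public class BucketedHourCounterSketch {

    private final int buckets;         // number of one-second buckets, e.g. 3600
    private final long[] counts;       // count stored in each slot
    private final long[] bucketSecond; // which second each slot currently represents

    BucketedHourCounterSketch(int buckets) {
        this.buckets = buckets;
        this.counts = new long[buckets];
        this.bucketSecond = new long[buckets];
        Arrays.fill(bucketSecond, -1L);
    }

    // Adds one event: only a single bucket is updated.
    void add(long timestampMs) {
        long second = timestampMs / 1000;
        int slot = (int) (second % buckets);
        if (bucketSecond[slot] != second) {
            // The slot still holds an expired second, so reuse it.
            bucketSecond[slot] = second;
            counts[slot] = 0;
        }
        counts[slot]++;
    }

    // Sums all buckets whose second lies in (nowSecond - buckets, nowSecond].
    long countAt(long nowMs) {
        long nowSecond = nowMs / 1000;
        long total = 0;
        for (int i = 0; i < buckets; i++) {
            long s = bucketSecond[i];
            if (s >= 0 && s > nowSecond - buckets && s <= nowSecond) {
                total += counts[i];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        BucketedHourCounterSketch counter = new BucketedHourCounterSketch(3600);
        counter.add(0L);
        counter.add(500L);
        counter.add(1_000L);
        counter.add(3_599_000L);
        System.out.println(counter.countAt(3_599_000L)); // prints 4
        System.out.println(counter.countAt(3_600_000L)); // prints 2
    }
}
```

The trade-off is that a read scans all buckets, while a write is constant time; that fits a workload with many events per key and one result per second.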

Re: How to implement Multi-tenancy in Flink

2018-07-05 Thread Ahmad Hassan
Hi Chesnay, Yes, this is something we would eventually be doing, and then maintaining the configuration of which tenants are mapped to which Flink jobs. This would reduce the number of Flink jobs to maintain in order to support 1000s of tenants in our use case. Thanks. On Wed, 4 Jul 2018 at

Re: A use-case for Flink and reactive systems

2018-07-05 Thread Fabian Hueske
Hi Yersinia, The main idea of an event-driven application is to hold the state (i.e., the account data) in the streaming application and not in an external database like Couchbase. This design is very scalable (state is partitioned) and avoids look-ups from the external database because all state

Slide Window Compute Optimization

2018-07-05 Thread YennieChen88
Hi, I want to use sliding windows with a window size of 1 hour and a slide of 1 second. I found that once an element arrives, it is processed in 3600 windows serially by one thread. It takes several seconds to finish processing one element, much more than I expected. Do I have any way to
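
The figure of 3600 follows from window size divided by slide (3,600,000 ms / 1,000 ms). A small sketch of how an element maps to overlapping windows makes this concrete. It assumes windows aligned to multiples of the slide with no offset; the class and method names are hypothetical and this is not Flink's assigner code.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: which sliding windows [start, start + size) contain a given timestamp.
// Assumes windows aligned to multiples of the slide (no offset); hypothetical names.
public class SlidingWindowAssignSketch {

    static List<Long> windowStarts(long timestampMs, long sizeMs, long slideMs) {
        List<Long> starts = new ArrayList<>();
        // Start of the latest window that contains the timestamp.
        long lastStart = timestampMs - Math.floorMod(timestampMs, slideMs);
        // Walk backwards one slide at a time while the window still covers the timestamp.
        for (long start = lastStart; start > timestampMs - sizeMs; start -= slideMs) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        // 1 hour window, 1 second slide, as in the question above.
        List<Long> starts = windowStarts(10_000_500L, 3_600_000L, 1_000L);
        System.out.println(starts.size()); // prints 3600
        System.out.println(starts.get(0)); // prints 10000000
    }
}
```

Every element is therefore added to size/slide windows, which is why the per-element cost grows as the slide shrinks relative to the window size.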

Re: [DISCUSS] Flink 1.6 features

2018-07-05 Thread Vishal Santoshi
I am not sure whether this is on any roadmap and, as someone suggested, wishes are free... TensorFlow on Flink, though ambitious, should be a big win. I am not sure how operator isolation for a hybrid GPU/CPU would be achieved and how repetitive execution could be natively supported by Flink but it

Re: A use-case for Flink and reactive systems

2018-07-05 Thread Yersinia Ruckeri
My idea started from here: https://flink.apache.org/usecases.html First use case describes what I am trying to realise ( https://flink.apache.org/img/usecases-eventdrivenapps.png) My application is Flink, listening to incoming events, changing the state of an object (really an aggregate here) and