Re: A use-case for Flink and reactive systems

2018-07-04 Thread Jörn Franke
I think it is a little bit overkill to use Flink for such a simple system.

Re: A use-case for Flink and reactive systems

2018-07-04 Thread Mich Talebzadeh
Looks interesting. As I understand it, you have a microservice-based ingestion setup where a topic is defined for streaming messages that include transactional data. These transactions should already exist in your DB. For now we look at the DB as part of your microservices and take a logical view of it.

Re: [DISCUSS] Flink 1.6 features

2018-07-04 Thread Stephan Ewen
Hi all! A late follow-up with some thoughts: In principle, all these are good suggestions and are on the roadmap. We are trying to make the release "by time", meaning we aim for a certain date (roughly in two weeks) and take whatever features are ready into the release. Looking at the status of the

A use-case for Flink and reactive systems

2018-07-04 Thread Yersinia Ruckeri
Hi all, I am working on a prototype which should include Flink in a reactive systems software. The easiest use-case is a traditional bank system where I have one microservice for transactions and another one for accounts/balances. Both are connected with Kafka. Transactions record a transaction

Re: How to set global config in the rich functions

2018-07-04 Thread Hequn Cheng
Hi zhen, Global configs cannot be passed like this. You can set global configs through the ExecutionConfig; more details here [1]. [1] https://ci.apache.org/projects/flink/flink-docs-master/dev/best_practices.html#register-the-parameters-globally On Wed, Jul 4, 2018 at 8:27 PM, zhen li wrote:
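The pattern described above can be sketched as follows. This is a hedged illustration, not code from the thread: the class `WordFilter` and the parameter name `forbidden` are invented. Parameters are registered once on the environment's ExecutionConfig and read back inside any rich function's open() method, which is what the linked best-practices page describes.

```java
import org.apache.flink.api.common.functions.RichFilterFunction;
import org.apache.flink.api.java.utils.ParameterTool;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class GlobalParamsExample {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        ParameterTool params = ParameterTool.fromArgs(args);

        // Register globally on the environment's ExecutionConfig,
        // NOT via getRuntimeContext() inside a source function.
        env.getConfig().setGlobalJobParameters(params);

        env.fromElements("a", "b", "c")
            .filter(new WordFilter())
            .print();

        env.execute("global parameters example");
    }

    // Any rich function downstream can now read the registered parameters.
    public static class WordFilter extends RichFilterFunction<String> {
        private String forbidden;

        @Override
        public void open(Configuration config) {
            ParameterTool params = (ParameterTool)
                getRuntimeContext().getExecutionConfig().getGlobalJobParameters();
            forbidden = params.get("forbidden", "b");
        }

        @Override
        public boolean filter(String value) {
            return !value.equals(forbidden);
        }
    }
}
```

Setting the parameters inside a source function's runtime context only mutates a local copy, which is why downstream operators see null.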

Re: How to partition within same physical node in Flink

2018-07-04 Thread Ashish Pokharel
Thanks - I will wait for Stefan’s comments before I start digging in.

Re: [Table API/SQL] Finding missing spots to resolve why 'no watermark' is presented in Flink UI

2018-07-04 Thread Jungtaek Lim
Thanks Fabian, filed FLINK-9742 [1]. I'll submit a PR for FLINK-8094 providing my TimestampExtractor. The implementation is also described in FLINK-9742. I'll start with the current implementation, which just leverages the automatic cast from STRING to SQL_TIMESTAMP, but we could improve it in the PR.

Re: [Table API/SQL] Finding missing spots to resolve why 'no watermark' is presented in Flink UI

2018-07-04 Thread Fabian Hueske
Hi, Glad you could get it to work! That's great :-) Regarding your comments: 1) Yes, I think we should make resultType() public. Please open a Jira issue and describe your use case. Btw. would you like to contribute your TimestampExtractor to Flink (or even a more generic one that allows to

Re: [Table API/SQL] Finding missing spots to resolve why 'no watermark' is presented in Flink UI

2018-07-04 Thread Jungtaek Lim
Thanks again Fabian for providing a nice suggestion! Finally I got it working by applying your suggestion. A couple of tricks were needed: 1. I had to apply a hack (create a new TimestampExtractor class in package org.apache.flink.blabla...) since Expression.resultType is defined as "package private"

Handling back pressure in Flink.

2018-07-04 Thread Mich Talebzadeh
Hi, In Spark one can handle back pressure by setting the Spark conf parameter: sparkConf.set("spark.streaming.backpressure.enabled","true") With backpressure you make the Spark Streaming application stable, i.e. it receives data only as fast as it can process it. In general one needs to ensure that

How to set global config in the rich functions

2018-07-04 Thread zhen li
Hi: I want to set some configs in the source functions using getRuntimeContext().getExecutionConfig().setGlobalJobParameters(parameterTool) and use the configs in downstream operators, such as a filter function, through getGlobalJobParameters. But it throws a NullPointerException. Is

Re: Let BucketingSink roll file on each checkpoint

2018-07-04 Thread Fabian Hueske
Hi Xilang, I thought about this again. The bucketing sink would need to roll on event-time intervals (similar to the current processing time rolling) which are triggered by watermarks in order to support consistency. However, it would also need to maintain a write ahead log of all received rows

Re: CoreOptions.TMP_DIRS bug

2018-07-04 Thread Gary Yao
Hi Oleksandr, I think your conclusions are correct. Thank you for looking into it. You can open a JIRA ticket describing the issue. Best, Gary

Re: [Table API/SQL] Finding missing spots to resolve why 'no watermark' is presented in Flink UI

2018-07-04 Thread Fabian Hueske
Hi Jungtaek, If it is "only" about the missing support to parse a string as timestamp, you could also implement a custom TimestampExtractor that works similar to the ExistingField extractor [1]. You would need to adjust a few things and use the expression "Cast(Cast('tsString,

Re: How to implement Multi-tenancy in Flink

2018-07-04 Thread Chesnay Schepler
Would it be feasible for you to partition your tenants across jobs, like for example 100 customers per job?

Re: How to implement Multi-tenancy in Flink

2018-07-04 Thread Ahmad Hassan
Hi Fabian, The one-job-per-tenant model soon becomes hard to maintain. For example, 1000 tenants would require 1000 Flink jobs, and providing HA and resilience for 1000 jobs is not a trivial solution. This is why we are hoping to have a single Flink job handle all the tenants through keyBy tenant. However

Re: How to implement Multi-tenancy in Flink

2018-07-04 Thread Fabian Hueske
Hi Ahmad, Some tricks that might help to bring down the effort per tenant if you run one job per tenant (or key per tenant): - Pre-aggregate records in a 5 minute Tumbling window. However, pre-aggregation does not work for FoldFunctions. - Implement the window as a custom ProcessFunction that
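The keyBy-per-tenant plus pre-aggregation idea from this thread can be sketched roughly as below. This is an assumption-laden illustration, not code from the discussion: the `Event` POJO, its fields, and `SumPerTenant` are invented, and a processing-time window is used to keep the sketch self-contained (an event-time window would additionally need timestamp/watermark assignment).

```java
import org.apache.flink.api.common.functions.AggregateFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class TenantPreAggregation {

    // Hypothetical event type; tenantId is the multi-tenancy key.
    public static class Event {
        public String tenantId;
        public long value;
        public Event() {}
        public Event(String tenantId, long value) {
            this.tenantId = tenantId;
            this.value = value;
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements(new Event("tenantA", 3), new Event("tenantA", 4), new Event("tenantB", 1))
            .keyBy("tenantId") // one key space covering all tenants in a single job
            .window(TumblingProcessingTimeWindows.of(Time.minutes(5)))
            .aggregate(new SumPerTenant())
            .print();

        env.execute("per-tenant pre-aggregation");
    }

    // Eager pre-aggregation: only the running sum is kept as window state,
    // and partial aggregates can be merged. A FoldFunction cannot merge
    // partial results, which is why pre-aggregation does not work there.
    public static class SumPerTenant implements AggregateFunction<Event, Long, Long> {
        @Override public Long createAccumulator() { return 0L; }
        @Override public Long add(Event e, Long acc) { return acc + e.value; }
        @Override public Long getResult(Long acc) { return acc; }
        @Override public Long merge(Long a, Long b) { return a + b; }
    }
}
```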

Re: Flink memory management in table api

2018-07-04 Thread Fabian Hueske
State is maintained in the configured state backend [1]. Best, Fabian [1] https://ci.apache.org/projects/flink/flink-docs-release-1.5/ops/state/state_backends.html 2018-07-04 11:38 GMT+02:00 Amol S - iProgrammer : > Hello fabian, > > Thanks for your quick response, > > According to above
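For reference, configuring the state backend mentioned above is typically done in flink-conf.yaml. A sketch only; the paths are placeholders, and the same choice can also be made per job via StreamExecutionEnvironment#setStateBackend:

```yaml
# Where operator and keyed state is kept: 'jobmanager' (heap), 'filesystem',
# or 'rocksdb' for state larger than memory.
state.backend: rocksdb
# Durable location for checkpoint data (placeholder path).
state.checkpoints.dir: hdfs:///flink/checkpoints
```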

Re: Let BucketingSink roll file on each checkpoint

2018-07-04 Thread XilangYan
Hi Fabian, We do need a consistent view of the data; we need the Counter and the HDFS file to be consistent. For example, when the Counter indicates that 1000 messages were written to HDFS, there must be exactly 1000 messages in HDFS ready to read. The data we write to HDFS is collected by an

Re: [Table API/SQL] Finding missing spots to resolve why 'no watermark' is presented in Flink UI

2018-07-04 Thread Jungtaek Lim
Thanks Chesnay! Great news to hear. I'll try it out with the latest master branch. Thanks Fabian for providing the docs! I guess I already tried out KafkaJsonTableSource and fell back to a custom TableSource, since the type of the rowtime field is unfortunately string, and I needed to parse and map it to

Re: run time error java.lang.NoClassDefFoundError: org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumer09

2018-07-04 Thread Mich Talebzadeh
Yes indeed, thanks. It is all working fine, but only writing to a text file. I want to do with Flink what I do with Spark Streaming: writing high-value events to HBase on HDFS.

Re: Checkpointing in Flink 1.5.0

2018-07-04 Thread Chesnay Schepler
Reference: https://issues.apache.org/jira/browse/FLINK-9739 On 04.07.2018 10:46, Chesnay Schepler wrote: It's not really path-parsing logic, but path handling i suppose; see RocksDBStateBackend#setDbStoragePaths(). I went ahead and converted said method into a simple test method, maybe this

Re: [Table API/SQL] Finding missing spots to resolve why 'no watermark' is presented in Flink UI

2018-07-04 Thread Fabian Hueske
Hi Jungtaek, Flink 1.5.0 features a TableSource for Kafka and JSON [1], incl. timestamp & watermark generation [2]. It would be great if you could let us know if that addresses your use case and, if not, what's missing or not working. So far Table API / SQL does not have support for late-data side

Re: Checkpointing in Flink 1.5.0

2018-07-04 Thread Chesnay Schepler
It's not really path-parsing logic, but path handling I suppose; see RocksDBStateBackend#setDbStoragePaths(). I went ahead and converted said method into a simple test method; maybe this is enough to debug the issue. I assume this regression was caused by FLINK-6557, which refactored the

Re: Flink memory management in table api

2018-07-04 Thread Fabian Hueske
Hi Amol, The memory consumption depends on the query/operation that you are doing. Time-based operations like group-window aggregations, over-window aggregations, or window joins can automatically clean up their state once data is no longer needed. Operations such as non-windowed aggregations
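For the non-windowed operations mentioned above, Flink 1.5's Table API offers an idle state retention time to bound state growth. A hedged sketch: the table name `clicks` and the query are invented, and the table registration itself is omitted.

```java
import org.apache.flink.api.common.time.Time;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.StreamQueryConfig;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.java.StreamTableEnvironment;
import org.apache.flink.types.Row;

public class IdleStateRetention {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tEnv = TableEnvironment.getTableEnvironment(env);

        // Assumes a table 'clicks' was registered beforehand,
        // e.g. via tEnv.registerTableSource(...).
        StreamQueryConfig qConfig = tEnv.queryConfig();
        // Discard per-key state not accessed for at least 12 hours;
        // guarantee removal after at most 24 hours.
        qConfig.withIdleStateRetentionTime(Time.hours(12), Time.hours(24));

        Table result = tEnv.sqlQuery(
            "SELECT user_id, COUNT(*) FROM clicks GROUP BY user_id");
        tEnv.toRetractStream(result, Row.class, qConfig).print();

        env.execute("idle state retention example");
    }
}
```

Without a retention time, a non-windowed GROUP BY keeps one accumulator per distinct key forever.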

Re: [Table API/SQL] Finding missing spots to resolve why 'no watermark' is presented in Flink UI

2018-07-04 Thread Chesnay Schepler
The watermark display in the UI is bugged in 1.5.0. It is fixed on master and the release-1.5 branch, and will be included in 1.5.1 that is slated to be released next week.

Re: Why should we use an evictor operator in flink window

2018-07-04 Thread Fabian Hueske
Hi, The Evictor is useful if you want to remove some elements from the window state but not all. This also implies that a window is evaluated multiple times, because otherwise you could just filter in the user function (as you suggested) and purge the whole window afterwards. Evictors are

Re: Description of Flink event time processing

2018-07-04 Thread Fabian Hueske
Hi Elias, I agree, the docs lack a coherent discussion of event time features. Thank you for this write up! I just skimmed your document and will provide more detailed feedback later. It would be great to add such a page to the documentation. Best, Fabian 2018-07-03 3:07 GMT+02:00 Elias Levy :

Re: How to partition within same physical node in Flink

2018-07-04 Thread Fabian Hueske
Hi Ashish, I think we don't want to make it an official public API (at least not at this point), but maybe you can dig into the internal API and leverage it for your use case. I'm not 100% sure about all the implications, that's why I pulled in Stefan in this thread. Best, Fabian 2018-07-02

Re: [Table API/SQL] Finding missing spots to resolve why 'no watermark' is presented in Flink UI

2018-07-04 Thread Jungtaek Lim
Sorry, I forgot to mention the version: Flink 1.5.0, and I ran the app in IntelliJ, not from a cluster. On Wed, Jul 4, 2018 at 5:15 PM, Jungtaek Lim wrote: > Hi Flink users, > > I'm new to Flink and trying to evaluate a couple of streaming frameworks via > implementing the same apps. > > While

Re: run time error java.lang.NoClassDefFoundError: org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumer09

2018-07-04 Thread Fabian Hueske
Looking at the other threads, I assume you solved this issue. The problem should have been that FlinkKafkaConsumer09 is not included in the flink-connector-kafka-0.11 module, because it is the connector for Kafka 0.9 and not Kafka 0.11. Best, Fabian 2018-07-02 11:20 GMT+02:00 Mich Talebzadeh :

Re: Passing type information to JDBCAppendTableSink

2018-07-04 Thread Fabian Hueske
There is also the SQL:2003 MERGE statement that can be used to implement UPSERT logic. It is a bit verbose but supported by Derby [1]. Best, Fabian [1] https://issues.apache.org/jira/browse/DERBY-3155 2018-07-04 10:10 GMT+02:00 Fabian Hueske : > Hi Chris, > > MySQL (and maybe other DBMS as
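The MERGE-based upsert suggested above can be sketched through plain JDBC. A hedged illustration: the tables `word_counts`/`new_counts` and their columns are invented, the in-memory Derby URL assumes the Derby jar is on the classpath, and Derby has supported MERGE since 10.11 (per DERBY-3155).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class MergeUpsert {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:derby:memory:demo;create=true");
             Statement stmt = conn.createStatement()) {

            stmt.executeUpdate(
                "CREATE TABLE word_counts (word VARCHAR(64) PRIMARY KEY, cnt BIGINT)");
            stmt.executeUpdate(
                "CREATE TABLE new_counts (word VARCHAR(64), cnt BIGINT)");
            stmt.executeUpdate("INSERT INTO new_counts VALUES ('flink', 2)");

            // MERGE acts as an upsert: rows matching on the key are updated,
            // the rest are inserted.
            stmt.executeUpdate(
                "MERGE INTO word_counts t USING new_counts s ON t.word = s.word " +
                "WHEN MATCHED THEN UPDATE SET t.cnt = s.cnt " +
                "WHEN NOT MATCHED THEN INSERT (word, cnt) VALUES (s.word, s.cnt)");
        }
    }
}
```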

[Table API/SQL] Finding missing spots to resolve why 'no watermark' is presented in Flink UI

2018-07-04 Thread Jungtaek Lim
Hi Flink users, I'm new to Flink and trying to evaluate a couple of streaming frameworks by implementing the same apps. While implementing apps with both the Table API and SQL, I found 'no watermark' presented in the Flink UI, even though I had struggled to apply a row time attribute. For example,

CoreOptions.TMP_DIRS bug

2018-07-04 Thread Oleksandr Nitavskyi
Hello guys, We have discovered a minor issue with Flink 1.5 on YARN in particular, related to the way Flink manages temp paths (io.tmp.dirs) in configuration: https://ci.apache.org/projects/flink/flink-docs-master/ops/config.html#io-tmp-dirs 1. From what we can see in the code,