Re: [Flink 1.7.0] initial failures with starting high-parallelism job without checkpoint/savepoint

2019-01-24 Thread Steven Wu
Hi Andrey, Weird that I didn't see your reply in my email inbox. My colleague happened to see it in apache archive :) nope, we didn't experience it with 1.4 (previous version) Yes, we did use HA setup. high-availability: zookeeper high-availability.zookeeper.quorum: ...

Re: [DISCUSS] Towards a leaner flink-dist

2019-01-24 Thread Jark Wu
+1 for the leaner distribution and improve the "Download" page. On Fri, 25 Jan 2019 at 01:54, Bowen Li wrote: > +1 for leaner distribution and a better 'download' webpage. > > +1 for a full distribution if we can automate it besides supporting the > leaner one. If we support both, I'd image

Re: [DISCUSS] Towards a leaner flink-dist

2019-01-24 Thread jincheng sun
Hi Chesnay, Thank you for the proposal. And i like it very much. +1 for the leaner distribution. About improve the "Download" page, I think we can add the connectors download link in the "Optional components" section which @Timo Walther mentioned above. Regards, Jincheng Chesnay Schepler

Re: TimeZone shift problem in Flink SQL

2019-01-24 Thread Bowen Li
Hi, Did you consider timezone in conversion in your UDF? On Tue, Jan 22, 2019 at 5:29 AM 徐涛 wrote: > Hi Experts, > I have the following two UDFs, > unix_timestamp: transform from string to Timestamp, with the > arguments (value:String, format:String), return Timestamp >

Re: [Flink 1.6] How to get current total number of processed events

2019-01-24 Thread Congxian Qiu
Hi, Nhan Do you want the total number of the current parallelism or the operator? If you want the total number of the current parallelism, Is the operator state[1] satisfied with your use case? [1]

Re: `env.java.opts` not persisting after job canceled or failed and then restarted

2019-01-24 Thread Guowei Ma
This may be caused by a jvm process can only load a so once.So a triky way is to rename it。 发自我的 iPhone > 在 2019年1月25日,上午7:12,Aaron Levin 写道: > > Hi Ufuk, > > Update: I've pinned down the issue. It's multiple classloaders loading > `libhadoop.so`: > > ``` > failed to load native hadoop

Re: Trouble migrating state from 1.6.3 to 1.7.1

2019-01-24 Thread Tzu-Li (Gordon) Tai
Hi! We've double checked the code, and the only plausible cause of this is that you may be using flink-avro 1.6.x with Flink 1.7.x. Could you double check that all Flink dependencies, including flink-avro, are 1.7.1? You can verify this by doing `mvn dependency:tree` on your job, and check that

Re: Is there a way to get all flink build-in SQL functions

2019-01-24 Thread Hequn Cheng
Hi yinhua, As Chesnay suggest, document is a good way. You can find descriptions and example for each udf. If you only want to get a list of name, you can also take a look at the flink code(i.e., the BasicOperatorTable.builtInSqlOperators

Re: Change Flink checkpoint configuration at runtime

2019-01-24 Thread Chesnay Schepler
You cannot change the checkpointing configuration at runtime. You should be able to resume the job from the last checkpoint. On 22.01.2019 19:39, knur wrote: I'm running a streaming job that uses the following config: checkpointInterval = 5 mins minPauseBetweenCheckpoints = 2 mins

Re: Back pressure within a operator chain

2019-01-24 Thread Chesnay Schepler
The behavior should be identical regardless of whether the are chained or not. On 23.01.2019 09:11, Paul Lam wrote: Hi, I would like to know if back pressure applies to operators in the same operator chain? The background is that I have a simple streaming job that consumes data from

Re: Is there a way to get all flink build-in SQL functions

2019-01-24 Thread Chesnay Schepler
Beyond the documentation I don't believe there to be a mechanism for listing all built-in functions. On 23.01.2019 04:30, yinhua.dai wrote: I would like to put this list to the our self service flink SQL web UI.

Re: Flink CEP : Doesn't generate output

2019-01-24 Thread Chesnay Schepler
Can you provide us a self-contained reproducing example? (preferably as elementary as possible) On 22.01.2019 18:58, dhanuka ranasinghe wrote: Hi All, I have used Flink CEP to filter some events and generate some alerts based on certain conditions. But unfortunately doesn't print any

Re: No resource available error while testing HA

2019-01-24 Thread Gary Yao
Hi Averell, > Then I have another question: when JM cannot start/connect to the JM on .88, > why didn't it try on .82 where resource are still available? When you are deploying on YARN, the TM container placement is decided by the YARN scheduler and not by Flink. Without seeing the complete

Re: Trouble migrating state from 1.6.3 to 1.7.1

2019-01-24 Thread pwestermann
I ran `mvn dependency:tree` and only see 1.7.1 dependencies for Flink: [INFO] com.inin.analytics:analytics-flink:jar:0.0.1-SNAPSHOT [INFO] +- org.apache.flink:flink-streaming-java_2.11:jar:1.7.1:provided [INFO] | +- org.apache.flink:flink-runtime_2.11:jar:1.7.1:provided [INFO] | | +-

Re: [Flink 1.6] How to get current total number of processed events

2019-01-24 Thread Kien Truong
Hi Nhan, You can store the max/min value using the value states of a KeyedProcessFunction, or in the global state of a ProcessWindowFunction. On processing each item, compare its value to the current max/min and update the stored value as needed. Regards, Kien On 1/24/2019 12:37 AM,

Re: Use case for The Broadcast State Pattern and efficient database access

2019-01-24 Thread Andrey Zagrebin
Hi Marke, Q1: From your description of the problem, "Broadcast State Pattern" seems to be the suitable choice. If you want to keep the same state on all parallel instances which process stream[1] and update/store that state the same way on each instance by using each element of stream[2]. Q2:

Re: [Flink 1.7.0] initial failures with starting high-parallelism job without checkpoint/savepoint

2019-01-24 Thread Andrey Zagrebin
Hi Steven, Did you not experience this problem with previous Flink release (your marked topic with 1.7)? Do you use HA setup? Without HA setup, the blob data, which belongs to the job, will be distributed from job master node to all task executors. Depending on the size of the blob data (jars,

Re: `env.java.opts` not persisting after job canceled or failed and then restarted

2019-01-24 Thread Aaron Levin
Hi Ufuk, I'm starting to believe the bug is much deeper than the originally reported error because putting the libraries in `/usr/lib` or `/lib` does not work. This morning I dug into why putting `libhadoop.so` into `/usr/lib` didn't work, despite that being in the `java.library.path` at the call

Re: [DISCUSS] Towards a leaner flink-dist

2019-01-24 Thread Bowen Li
+1 for leaner distribution and a better 'download' webpage. +1 for a full distribution if we can automate it besides supporting the leaner one. If we support both, I'd image release managers should be able to package two distributions with a single change of parameter instead of manually package

Re: Dump snapshot of big table in real time using StreamingFileSink

2019-01-24 Thread knur
Bump?  -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Adding flink udf support to linkedin's portable udf framework transport

2019-01-24 Thread Arup Malakar
Hi Flink Users, Came across the project transport from linkedin: https://github.com/linkedin/transport I think the project has great potential which allows for sharing udf implementation across various compute engines (hive/spark/presto). Any thoughts on adding support for flink udfs to transport