Re: What is the right way to use the physical partitioning strategy in Data Streams?

2019-09-24 Thread Felipe Gutierrez
yes. It will be very welcome a discussion with who knows better than me. Basically, I am trying to implement the issue FLINK-1725 [1] that was gave up on March 2017. Stephan Ewen said that there are more issues to be fixed before going to this implementation and I don't really know which are

Re: Recommended approach to debug this

2019-09-24 Thread Debasish Ghosh
Well, I think I got the solution though I am not yet sure of the problem .. The original code looked like this .. Try { // from a parent class called Runner which runs a streamlet // run returns an abstraction which completes a Promise depending on whether // the Job was successful or not

HBaseTableSource for SQL query errors

2019-09-24 Thread ????????
I am using the HBaseTableSource class for SQL query errors.No error outside Flink using HBase demo. My flink version is 1.8.1,use flink table SQL API flink code show as below: // environment configuration ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

Re: Recommended approach to debug this

2019-09-24 Thread Biao Liu
The key point of this case is in `PackagedProgram#callMainMethod`. The `ProgramAbortException` is expected when executing the main method here. This `ProgramAbortException` thrown is wrapped with `InvocationTargetException` by Java reflection layer [1]. There is a piece of codes handling

Re: How to use thin JAR instead of fat JAR when submitting Flink job?

2019-09-24 Thread Fabian Hueske
Hi, To expand on Dian's answer. You should not add Flink's core libraries (APIs, core, runtime, etc.) to your fat JAR. However, connector dependencies (like Kafka, Cassandra, etc.) should be added. If all your jobs require the same dependencies, you can also add JAR files to the ./lib folder of

Re: Flink job manager doesn't remove stale checkmarks

2019-09-24 Thread Clay Teeter
Oh geez, checkmarks = checkpoints... sorry. What i mean by stale "checkpoints" are checkpoints that should be reaped by: "state.checkpoints.num-retained: 3". What is happening is that directories: - state.checkpoints.dir: file:///opt/ha/49/checkpoints - high-availability.storageDir:

Re: Question about reading ORC file in Flink

2019-09-24 Thread Fabian Hueske
Hi QiShu, It might be that Flink's OrcInputFormat has a bug. Can you open a Jira issue to report the problem? In order to be able to fix this, we need as much information as possible. It would be great if you could create a minimal example of an ORC file and a program that reproduces the issue.

Re: Kafka producer failed with InvalidTxnStateException when performing commit transaction

2019-09-24 Thread Tony Wei
Hi Becket, I have read kafka source code and found that the error won't be propagated to client if the list of topic-partition is empty [1], because it bind the error with each topic-partition. If this list is empty, then that error won't be packaged into response body. That made the client

Re: Approach to match join streams to create unique streams.

2019-09-24 Thread Fabian Hueske
Hi, AFAIK, Flink SQL Temporal table function joins are only supported as inner equality joins. An extension to left outer joins would be great, but is not on the immediate roadmap AFAIK. If you need the inverse, I'd recommend to implement the logic in a DataStream program with a

Re: Is there a lifecycle listener that gets notified when a topology starts/stops on a task manager

2019-09-24 Thread Stephen Connolly
I have created https://issues.apache.org/jira/browse/FLINK-14184 as a proposal to improve Flink in this specific area. On Tue, 24 Sep 2019 at 03:23, Zhu Zhu wrote: > Hi Stephen, > > I think disposing static components in the closing stage of a task is > required. > This is because your

Challenges Deploying Flink With Savepoints On Kubernetes

2019-09-24 Thread Sean Hester
hi all--we've run into a gap (knowledge? design? tbd?) for our use cases when deploying Flink jobs to start from savepoints using the job-cluster mode in Kubernetes. we're running a ~15 different jobs, all in job-cluster mode, using a mix of Flink 1.8.1 and 1.9.0, under GKE (Google Kubernetes

Re: Challenges Deploying Flink With Savepoints On Kubernetes

2019-09-24 Thread Yuval Itzchakov
AFAIK there's currently nothing implemented to solve this problem, but working on a possible fix can be implemented on top of https://github.com/lyft/flinkk8soperator which already has a pretty fancy state machine for rolling upgrades. I'd love to be involved as this is an issue I've been thinking

Re: Recommended approach to debug this

2019-09-24 Thread Zili Chen
Actually there is an ongoing client API refactoring on this stuff[1] and one of the main purpose is eliminating hijacking env.execute... Best, tison. [1] https://lists.apache.org/x/thread.html/ce99cba4a10b9dc40eb729d39910f315ae41d80ec74f09a356c73938@%3Cdev.flink.apache.org%3E Biao Liu

RE: NoClassDefFoundError in failing-restarting job that uses url classloader

2019-09-24 Thread Subramanyam Ramanathan
Hi, Thank you. I think the takeaway for us is that we need to make sure that the threads are stopped in the close() method. With regard to FLINK-10455, I see that the fix versions say : 1.5.6, 1.7.0, 1.7.3, 1.8.1, 1.9.0 However, I’m unable to find 1.7.3 in the downloads

Re: How do I create a temporal function using Flink Clinet SQL?

2019-09-24 Thread Fabian Hueske
Hi, It's not possible to create a temporal table function from SQL, but you can define it in the config.yaml of the SQL client as described in the documentation [1]. Best, Fabian [1] https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/table/sqlClient.html#temporal-tables Am Di.,

RE: NoClassDefFoundError in failing-restarting job that uses url classloader

2019-09-24 Thread Subramanyam Ramanathan
Hi Zhu, We also use FlinkKafkaProducer(011), hence I felt this fix would also be needed for us. I agree that the fix for the issue I had originally mentioned would not be fixed by this, but I felt that I should be consuming this fix also. Thanks, Subbu From: Zhu Zhu

Setting environment variables of the taskmanagers (yarn)

2019-09-24 Thread Richard Deurwaarder
Hello, We have our flink job (1.8.0) running on our hadoop 2.7 cluster with yarn. We would like to add the GCS connector to use GCS rather than HDFS. Following the documentation of the GCS connector[1] we have to specify which credentials we want to use and there are two ways of doing this: *

How do I create a temporal function using Flink Clinet SQL?

2019-09-24 Thread srikanth flink
Hi, I'm running time based joins, dynamic table over temporal function. Is there a way I could create temporal table using flink SQL. And I'm using v1.9. Thanks Srikanth

Re: Recommended approach to debug this

2019-09-24 Thread Biao Liu
So I believe (I did't test it) the solution for this case is keeping the original exception thrown from `env.execute()` and throwing this exception out of main method. It's a bit tricky, maybe we could have a better design of this scenario. Thanks, Biao /'bɪ.aʊ/ On Tue, 24 Sep 2019 at 18:55,

??????HBaseTableSource for SQL query errors

2019-09-24 Thread ????????
The following exception was thrown in the MiniCluster.executeJobBlocking method via the debug source code. akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://flink/user/dispatcher#997865675]] after [1 ms]. Sender[null] sent message of type

Re: Have trouble on running flink

2019-09-24 Thread Biao Liu
Hi Russell, I don't think `BackendBuildingException` is root cause. In your case, this exception appears when task is under cancelling. Have you ever checked the log of yarn node manager? There should be an exit code of container. Even more the container is probably killed by yarn node manager.

Re: Recommended approach to debug this

2019-09-24 Thread Biao Liu
Hi Zili, Great to hear that! Hope to see the new client soon! Thanks, Biao /'bɪ.aʊ/ On Tue, 24 Sep 2019 at 19:23, Zili Chen wrote: > Actually there is an ongoing client API refactoring on this stuff[1] and > one of the main purpose is > eliminating hijacking env.execute... > > Best, >

Re: NoClassDefFoundError in failing-restarting job that uses url classloader

2019-09-24 Thread Zhu Zhu
Hi Subramanyam, I think you do not need the fix in FLINK-10455 which is for Kafka only. It's just a similar issue as you met. As you said, we need to make sure that the operator/UDF spawned threads are stopped in the close() method. In this way, we can avoid the thread to throw

Re: Approach to match join streams to create unique streams.

2019-09-24 Thread srikanth flink
Fabian, Thanks, already implemented the left join. Srikanth On Tue, Sep 24, 2019 at 2:12 PM Fabian Hueske wrote: > Hi, > > AFAIK, Flink SQL Temporal table function joins are only supported as inner > equality joins. > An extension to left outer joins would be great, but is not on the >

Re: Challenges Deploying Flink With Savepoints On Kubernetes

2019-09-24 Thread Hao Sun
We always make a savepoint before we shutdown the job-cluster. So the savepoint is always the latest. When we fix a bug or change the job graph, it can resume well. We only use checkpoints for unplanned downtime, e.g. K8S killed JM/TM, uncaught exception, etc. Maybe I do not understand your use

Re: Setting environment variables of the taskmanagers (yarn)

2019-09-24 Thread Peter Huang
Hi Richard, For the first question, I don't think you need to explicitly specify fs.hdfs.hadoopconf as each file in the ship folder is copied as a yarn local resource for containers. The configuration path is overridden internally in Flink. For the second question of setting TM environment

Re: Setting environment variables of the taskmanagers (yarn)

2019-09-24 Thread bupt_ljy
Hi Richard, You can use dynamic properties to add your environmental variables. Set jobmanager env: e.g. -Dcontainerized.master.env.GOOGLE_APPLICATION_CREDENTIALS=xyz Set taskmanager env: e.g. -Dcontainerized.taskmanager.env.GOOGLE_APPLICATION_CREDENTIALS=xyz Best Regards, Jiayi Liao

Re: NoClassDefFoundError in failing-restarting job that uses url classloader

2019-09-24 Thread Zhu Zhu
Hi Subramanyam, I checked the commits. There are 2 fixes in FLINK-10455, only release 1.8.1 and release 1.9.0 contain both of them. Thanks, Zhu Zhu Subramanyam Ramanathan 于2019年9月24日周二 下午11:02写道: > Hi Zhu, > > > > We also use FlinkKafkaProducer(011), hence I felt this fix would also be >

Re: Challenges Deploying Flink With Savepoints On Kubernetes

2019-09-24 Thread Yuval Itzchakov
Hi Hao, I think he's exactly talking about the usecase where the JM/TM restart and they come back up from the latest savepoint which might be stale by that time. On Tue, 24 Sep 2019, 19:24 Hao Sun, wrote: > We always make a savepoint before we shutdown the job-cluster. So the > savepoint is

Re: [SURVEY] How many people are using customized RestartStrategy(s)

2019-09-24 Thread Steven Wu
Zhu Zhu, Sorry, I was using different terminology. yes, Flink meter is what I was talking about regarding "fullRestarts" for threshold based alerting. On Mon, Sep 23, 2019 at 7:46 PM Zhu Zhu wrote: > Steven, > > In my mind, Flink counter only stores its accumulated count and reports > that

Flink Temporal Tables Usage

2019-09-24 Thread Nishant Gupta
Hi Team, I have slight confusion w.r.t usage of temporal tables. In documentation [1], it mentions that we need to use Lookuptables like HBaseTableSource and In documentation [2], while using SQLClient, there isn't anything mentioned about it. Do we need to use the same kind of LookUpTables in

Re: Challenges Deploying Flink With Savepoints On Kubernetes

2019-09-24 Thread Hao Sun
I think I overlooked it. Good point. I am using Redis to save the path to my savepoint, I might be able to set a TTL to avoid such issue. Hao Sun On Tue, Sep 24, 2019 at 9:54 AM Yuval Itzchakov wrote: > Hi Hao, > > I think he's exactly talking about the usecase where the JM/TM restart and >

Re: [SURVEY] How many people are using customized RestartStrategy(s)

2019-09-24 Thread Zhu Zhu
Hi Steven, As a conclusion, since we will have a meter metric[1] for restarts, customized restart strategy is not needed in your case. Is that right? [1] https://issues.apache.org/jira/browse/FLINK-14164 Thanks, Zhu Zhu Steven Wu 于2019年9月25日周三 上午2:30写道: > Zhu Zhu, > > Sorry, I was using

Re: NoClassDefFoundError in failing-restarting job that uses url classloader

2019-09-24 Thread Dian Fu
Hi Subramanyam, 1.7.3 is not released yet. You need cherrypick these fixes if they really need them. Regards, Dian > 在 2019年9月25日,上午12:08,Zhu Zhu 写道: > > Hi Subramanyam, > > I checked the commits. > There are 2 fixes in FLINK-10455, only release 1.8.1 and release 1.9.0 > contain both of

Re: Question about reading ORC file in Flink

2019-09-24 Thread 163
Hi Fabian, After debugging in local mode, I found that Flink orc connector is no problem, but some fields in our schema is in capital form,so these fields can not be matched. But the program directly read orc file using includeColumns method, which will use equalsIgnoreCase to match the

Re: 请教初始化系统缓存的问题

2019-09-24 Thread 高博
你好, 我这里提供几个思路,我们公司做车联网的,目前线上运行的程序,都需要处理你说的这些场景。 1. 有什么方式能否保证数据开始处理的时候,基础数据已经缓存好了,可以在流处理中获取到(类似valueState),算子的open函数似乎不能写到valueState里面去,只能初始化state。 目前我们需要使用的外部缓存数据,包含需要调用接口的,需要读取数据库的基础数据。 针对调用接口,我们使用的guava的异步缓存刷新策略 针对数据库中的基础数据,我们类似懒加载,当第一条数据过来的时候,我们会锁住流,等待所有的数据都读取完了,整个流才继续执行。 2.

Re: Re: 请教初始化系统缓存的问题

2019-09-24 Thread haoxin...@163.com
非常感谢,大家同行。 我们目前是确实按照类似你说的这些方式去完成的。但是我们始终觉得应该有更加flink的方式优雅完成,就像维表join。之前一直没有细看,谢谢提醒。 haoxin...@163.com 发件人: 高博 发送时间: 2019-09-24 16:03 收件人: user-zh 主题: Re: 请教初始化系统缓存的问题 你好, 我这里提供几个思路,我们公司做车联网的,目前线上运行的程序,都需要处理你说的这些场景。 1.

Re: Flink ORC 读取问题

2019-09-24 Thread 163
Terry Wang: 通过本地调试,发现Flink orc connector没有问题,是我们自己的schema中有部分字段有大小写,所以在匹配的时候没有匹配到,谢谢! Qi Shu > 在 2019年9月24日,上午11:01,Terry Wang 写道: > > 能否起一个本地程序,设置断点,看看读取数据那块儿逻辑是不是有问题 > Best, > Terry Wang > > > >> 在 2019年9月23日,下午5:11,ShuQi 写道: >> >>