回复: Elasticsearch sink connector timeout

2021-06-06 Thread Jacky Yin 殷传旺
In flink-es connector 6.*, you can set the socket timeout by implementing a customized RestClientFactory。 Here is the code snippet. @Override public void configureRestClientBuilder(RestClientBuilder restClientBuilder) { restClientBuilder

Re: 关于flink sql的kafka source的开始消费offset相关问题。

2021-06-06 Thread JasonLee
hi 那你只需要设置从 latest-offset 开始消费,并且禁用 checkpoint 就行了,至于重启的次数,可以通过 metrics 中的 numRestarts 去获取. - Best Wishes JasonLee -- Sent from: http://apache-flink.147419.n8.nabble.com/

flink on yarn日志清理

2021-06-06 Thread zjfpla...@hotmail.com
大家好, 请问下如下问题: flink on yarn模式,日志清理机制有没有的? 是不是也是按照log4j/logback/log4j2等的清理机制来的?还是yarn上配置的。 是实时流作业,非离线一次性作业,一直跑着的 zjfpla...@hotmail.com

Re: Elasticsearch sink connector timeout

2021-06-06 Thread Yangze Guo
Hi, Kai, I think the exception should be thrown from RetryRejectedExecutionFailureHandler as you configure the 'failure-handler' to 'retry-rejected'. It will retry the action that failed with EsRejectedExecutionException and throw all other failures. AFAIK, there is no way to configure the

Re: Question about State TTL and Interval Join

2021-06-06 Thread Yun Tang
Hi Chris, Interval Join should clean state which is not joined during interval and you don't need to set state TTL. (Actually, the states used in interval join are not exposed out and you cannot set TTL for those state as TTL is only public for user self-described states.) The checkpoint size

Re: Multiple Exceptions during Load Test in State Access APIs with RocksDB

2021-06-06 Thread Chirag Dewan
Thanks for the reply Yun. I strangely don't see any nulls. And infact this exception comes on the first few records and then job starts processing normally. Also, I don't see any reason for Concurrent access to the state in my code. Could more CPU cores than task slots to the Task Manager be

Re: 关于flink sql的kafka source的开始消费offset相关问题。

2021-06-06 Thread Yun Tang
hi, 本质上来说,你的做法有点hack其实不推荐,如果非要这么做的话,你还可以通过 numRestarts [1] 的指标来看重启了多少次。 [1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/metrics/#availability 祝好 唐云 From: yidan zhao Sent: Friday, June 4, 2021 11:52 To: user-zh Subject: Re: 关于flink

Re: Flink app performance test framework

2021-06-06 Thread Yangze Guo
Hi, Luck, I may not fully understand your requirements. If you just want to test the performance of typical streaming jobs with the Flink, you can refer to the nexmark[1]. If you just care about the performance regression of your specific production jobs, I don't know there is such a framework.

Re: Add control mode for flink

2021-06-06 Thread 刘建刚
Thank you for the reply. I have checked the post you mentioned. The dynamic config may be useful sometimes. But it is hard to keep data consistent in flink, for example, what if the dynamic config will take effect when failover. Since dynamic config is a desire for users, maybe flink can support

回复:flink1.12版本,yarn-application模式Flink web ui看不到日志

2021-06-06 Thread smq

Re: [DISCUSS] FLIP-148: Introduce Sort-Merge Based Blocking Shuffle to Flink

2021-06-06 Thread Jingsong Li
Thanks Yingjie for the great effort! This is really helpful to Flink Batch users! Best, Jingsong On Mon, Jun 7, 2021 at 10:11 AM Yingjie Cao wrote: > Hi devs & users, > > The FLIP-148[1] has been released with Flink 1.13 and the final > implementation has some differences compared with the

flink sql cdc作数据同步作业数太多

2021-06-06 Thread casel.chen
flink sql cdc作数据同步,因为是基于库+表级别的,表数量太多导致作业数太多。请问能否用flink sql cdc基于库级别同步?这样作业数量会少很多。

Re: Flink checkpoint 速度很慢 问题排查

2021-06-06 Thread yidan zhao
可以的,本身异步操作的本质就是线程池。 至于是你自己提供线程池,去执行某个同步操作。还是直接使用client/sdk等封装的异步方法内部默认的线程池这个无所谓。 Jacob <17691150...@163.com> 于2021年6月5日周六 下午1:15写道: > > thanks, > > 我查看了相关文档[1] 由于redis以及hbase的交互地方比较多,比较零散,不光是查询,还有回写redis > > 我打算把之前map算子的整段逻辑以线程池的形式丢在asyncInvoke()方法内部,不知道合适与否,这样数据的顺序性就无法得到保障了吧? > > > > [1] >

Re: [DISCUSS] FLIP-148: Introduce Sort-Merge Based Blocking Shuffle to Flink

2021-06-06 Thread Yingjie Cao
Hi devs & users, The FLIP-148[1] has been released with Flink 1.13 and the final implementation has some differences compared with the initial proposal in the FLIP document. To avoid potential misunderstandings, I have updated the FLIP document[1] accordingly and I also drafted another

ubsubscribe

2021-06-06 Thread Zhipeng Zhang
-- best, Zhipeng

Re: Re: Is it possible to use OperatorState, when NOT implementing a source or sink function?

2021-06-06 Thread Yun Gao
Hi Marco, It seems to me that the imbalance problem and the state is independent for this issue: the data distribution is only decided by the KeySelector used. The only limitation for state is that the keyed state is bind to the KeySelector used across the tasks. If the imbalance is the root

Re: Re: Re: Failed to cancel a job using the STOP rest API

2021-06-06 Thread Yun Gao
Hi Thoms, Very thanks for reporting the exceptions, and it seems to be not work as expected to me... Could you also show us the dag of the job ? And does some operators in the source task use multiple-threads to emit records? Best, Yun --Original Mail --

Is it possible to customize avro schema name when using SQL

2021-06-06 Thread tao xiao
Hi team, I want to use avro-confluent to encode the data using SQL but the schema registered by the encoder hard code the schema name to 'record'. is it possible to dictate the name? -- Regards, Tao