Re: 是否可以 hive 流 join hive 流?

2021-10-26 Thread yidan zhao
请问,hive表不支持watermark,是不是和window tvf不支持batch也有关系? 当前hive表如果要分窗口统计是不是不可以用window tvf,是否也是因为hive表不支持time attribute(eventtime+watermark)的原因。 Leonard Xu 于2021年2月1日周一 下午2:24写道: > 还没有,你可以关注下这个issue[1] > > 祝好, > Leonard > [1] https://issues.apache.org/jira/browse/FLINK-21183 > > > 在

Re: s3 access denied with flink-s3-fs-presto

2021-10-26 Thread Parag Somani
Hello, I have successfully been able to store data on S3 bucket. Earlier, I used to have a similar issue. What you need to confirm: 1. S3 bucket is created with RW access(irrespective if it is minio or AWS S3) 2. "flink/opt/flink-s3-fs-presto-1.14.0.jar" jar is copied to plugin directory of

Re: Application mode - Custom Flink docker image with Python user code

2021-10-26 Thread Dian Fu
Hi Sumeet, It still has not provided special support to handle the dependencies for the Application mode in PyFlink. This means that the dependencies could be handled the same as the other deployment modes. However, it is indeed correct that the dependencies could be handled differently in

Re: Application mode - Custom Flink docker image with Python user code

2021-10-26 Thread Shuiqiang Chen
Hi Sumeet, Actually, running pyflink jobs in application mode on kubernetes has been supported since release 1.13. To build a docker image with PyFlink installed, please refer to Enabling Python[1]. In order to run the python code in application mode, you also need to COPY the code files into

Re: FlinkKafkaProducer deprecated in 1.14 but pyflink binding still present?

2021-10-26 Thread Dian Fu
Hi Francis, Yes, you are right. It's still not updated in PyFlink as KafkaSource/KafkaSink are still not supported in PyFlink. Hopeful we could add that support in 1.15 and then we could deprecate/remove the legacy interfaces. Regards, Dian On Tue, Oct 26, 2021 at 12:53 PM Francis Conroy <

Re: flink-yarn的pre-job模式

2021-10-26 Thread Shuiqiang Chen
你好, 上传的图片无法加载。 这种情况是 yarn 无法提供拉起taskmanager,检查下yarn资源是否充足? 王健 <13166339...@163.com> 于2021年10月26日周二 下午7:50写道: > 您好: > 我部署flink yarn的pre-job模式运行报错,麻烦看看是啥原因,非常感谢。 > > 1.运行命令:/usr/local/flink-1.13.2/bin/flink run -t yarn-per-job -c > com.worktrans.flink.wj.ods.FlinkCDC01

Re: How to refresh topics to ingest with KafkaSource?

2021-10-26 Thread Mason Chen
Hi all, I have a similar requirement to Preston. I created https://issues.apache.org/jira/browse/FLINK-24660 to track this effort. Best, Mason > On Oct 18, 2021, at 1:59 AM, Arvid Heise wrote: > > Hi Preston, > > if you still need to set

FlinkKafkaConsumer -> KafkaSource State Migration

2021-10-26 Thread Mason Chen
Hi all, I read these instructions for migrating to the KafkaSource: https://nightlies.apache.org/flink/flink-docs-release-1.14/release-notes/flink-1.14/#deprecate-flinkkafkaconsumer . Do we need to employ any uid/allowNonRestoredState tricks if our Flink job is also stateful outside of the

Re: s3 access denied with flink-s3-fs-presto

2021-10-26 Thread Vamshi G
s3a with hadoop s3 filesystem works fine for us wit sts assume role credentials and with kms. Below are how our hadoop s3a configs look like. Since the endpoint is globally whitelisted, we don't explicitly mention the endpoint. fs.s3a.aws.credentials.provider:

Re: Flink handle both kafka source and db source

2021-10-26 Thread Rafi Aroch
Hi, Take a look at the new 1.14 feature called Hybrid Source: https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/connectors/datastream/hybridsource/ Rafi On Tue, Oct 26, 2021 at 7:46 PM Qihua Yang wrote: > Hi, > > My flink app has two data sources. One is from a Kafka topic, one

Re: SplitEnumeratorContext callAsync() cleanup

2021-10-26 Thread Mason Chen
Hi Fabian, Unfortunately, I don't have the log since I was just testing it out on my local setup. I can try to reproduce it later in the week. Best, Mason On Mon, Oct 25, 2021 at 8:09 AM Fabian Paul wrote: > Hi Mason, > > Thanks for opening the ticket. Can you also share the log with us when

Re: Flink support for Kafka versions

2021-10-26 Thread Prasanna kumar
Hi , We are using Kafka broker version 2.4.1.1. Also kafka client 2.4.1.1 jar which is part of flink kafka connector recently was marked with high security issue. So we excluded the dependency and overriden it with kafka client 2.8.1 client jar and it works fine with the 2.4.1.1 broker. ( since

Re: Flink JDBC connect with secret

2021-10-26 Thread Qihua Yang
Hi Jing, Thank you for your suggestion. I will check if SSL parameters in URL works. Thanks, Qihua On Sat, Oct 23, 2021 at 8:37 PM JING ZHANG wrote: > Hi Qihua, > I checked user documents of several database vendors(postgres, oracle, > solidDB,SQL server)[1][2][3][4][5], and studied how to

RE: Huge backpressure when using AggregateFunction with Session Window

2021-10-26 Thread Schwalbe Matthias
Hi Ori, … answering from remote … * If not completely mistaken, Scala Vector is immutable, creating a copy whenever you append, but * This is not the main problem, the vectors collected so far get deserialized with every incoming event (from state storage) and afterward serialized

Re: Troubleshooting checkpoint timeout

2021-10-26 Thread Piotr Nowojski
I'm glad that I could help :) Piotrek pon., 25 paź 2021 o 16:04 Alexis Sarda-Espinosa < alexis.sarda-espin...@microfocus.com> napisał(a): > Oh, I got it. I should’ve made the connection earlier after you said “Once > an operator decides to send/broadcast a checkpoint barrier downstream, it >

flink-yarn的pre-job模式

2021-10-26 Thread 王健
您好: 我部署flink yarn的pre-job模式运行报错,麻烦看看是啥原因,非常感谢。 1.运行命令:/usr/local/flink-1.13.2/bin/flink run -t yarn-per-job -c com.worktrans.flink.wj.ods.FlinkCDC01 /usr/local/flink-1.13.2/flink_x.jar 提交正常,如图: 2.yarn 截图 3.flink截图:

Re: Re: Flink任务每运行20天均会发生内部异常

2021-10-26 Thread mayifan
非常感谢大佬的答复: 目前从任务来看的话总共存在三个任务,其中两个异常任务分别使用了1到2个MapState,过期时间均为1天或3天。 正常运行的任务使用了MapState及ListState各4个,过期时间为60min-120min。 异常任务在产生异常后从checkpoint重启又会恢复正常。 > -- 原始邮件 -- > 发 件 人:"Caizhi Weng" > 发送时间:2021-10-26 18:45:44 > 收 件 人:"flink中文邮件组" > 抄 送: > 主

RE: Using POJOs with the table API

2021-10-26 Thread Alexis Sarda-Espinosa
Hello, I've found a ticket that talks about very high-level improvements to the Table API [1]. Are there any more concrete pointers for migration from DataSet to Table API? Will it be possible at all to use POJOs with the Table API? [1] https://issues.apache.org/jira/browse/FLINK-20787

??????flink keyby??????????????????

2021-10-26 Thread yuankuo.xia
filter??filter?? ---- ??: "user-zh"

Re: flink写mysql问题

2021-10-26 Thread Caizhi Weng
Hi! Flink 1.11 对 jdbc 在流作业中的支持确实不完善,在流作业做 checkpoint 时没有处理。如果需要在流作业中使用 jdbc sink,建议升级到比较新的 1.13 或 1.14。 zya 于2021年10月26日周二 下午4:56写道: > 你好,感谢回复 > 在任务做检查点的时候,内存中缓存的一批数据如何 flush 到 mysql 中的呢? > > > 我用的是1.11.2版本的flink >

Re: Flink任务每运行20天均会发生内部异常

2021-10-26 Thread Caizhi Weng
Hi! 听起来和 state 过期时间非常有关。你配置了哪些和 state 过期相关的参数?是否有 20 天过期的 state? mayifan 于2021年10月26日周二 下午4:43写道: > Hi! > > 麻烦请教大家一个问题。 > > > 有三个Flink任务以yarn-per-job模式运行在Flink-1.11.2版本的集群上,均使用RocksDB作为状态后端,数据以增量的方式写入RocksDB,且均配置了状态过期时间。 > > >

Re: flink keyby之后数据倾斜的问题

2021-10-26 Thread Caizhi Weng
Hi! Flink SQL 里已经内置了很多解倾斜的方式,例如 local global 聚合。详见 [1],如果一定要使用 streaming api 可以参考该思路进行优化。 [1] https://ci.apache.org/projects/flink/flink-docs-master/zh/docs/dev/table/tuning/#local-global-%e8%81%9a%e5%90%88 xiazhl 于2021年10月26日周二 下午2:31写道: > hello everyone! >

Re: Not cleanup Kubernetes Configmaps after execution success

2021-10-26 Thread Roman Khachatryan
Thanks for sharing this, The sequence of events the log seems strange to me: 2021-10-17 03:05:55,801 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Close ResourceManager connection c1092812cfb2853a5576ffd78e346189: Stopping JobMaster for job 'rt-match_12.4.5_8d48b21a'

flink keyby??????????????????

2021-10-26 Thread yuankuo.xia
hello everyone?? ??keyby?? ??flink streamAPI?? 10 ??flinkidkeyby id

??????Flink????Operator????????????Metrics

2021-10-26 Thread yuankuo.xia
web-ui??metrics ---- ??: "user-zh"

Re: Async Performance

2021-10-26 Thread Arvid Heise
Hi Sanket, if you have a queue of 1000, then 1000 will be used in AsyncIO. Memory doesn't matter. What you need to double-check is if your async library can handle that many elements in parallel. The AsyncHttpClient should have a thread pool that effectively will put an upper limit on how many

?????? flink??mysql????

2021-10-26 Thread zya
?? ?? flush ?? mysql 1.11.2??flink sqlBufferReduceStatementExecutorflush??GenericJdbcSinkFunction

Flink任务每运行20天均会发生内部异常

2021-10-26 Thread mayifan
Hi! 麻烦请教大家一个问题。 有三个Flink任务以yarn-per-job模式运行在Flink-1.11.2版本的集群上,均使用RocksDB作为状态后端,数据以增量的方式写入RocksDB,且均配置了状态过期时间。 任务逻辑大致都是通过状态与历史数据进行自关联或双流join,每输入一条数据都会产出等量、1/2或多倍的数据到下游,当数据无法通过状态关联,任务则无法向下游产出数据。

Re: Flink没有Operator级别的数据量Metrics

2021-10-26 Thread Ada Luna
Web-UI中的就是Flink原生正常的Metrics,都是Task级别 xiazhl 于2021年10月26日周二 下午2:31写道: > > web-ui里面有metrics > > > > > --原始邮件-- > 发件人: > "user-zh"

Re: Fwd: How to deserialize Avro enum type in Flink SQL?

2021-10-26 Thread Peter Schrott
Hi people, I found a workaround for that issue - which works at least for my use case. The main idea was customizing "org.apache.flink.formats.avro.registry.confluent.RegistryAvroFormatFactory" such that the expected avro schema is not gained from the CREATE TABLE SQL statement, rather than

Application mode - Custom Flink docker image with Python user code

2021-10-26 Thread Sumeet Malhotra
Hi, I'm currently submitting my Python user code from my local machine to a Flink cluster running in Session mode on Kubernetes. For this, I have a custom Flink image with Python as per this reference [1]. Now, I'd like to move to using the Application mode with Native Kubernetes, where the user

??????Flink????Operator????????????Metrics

2021-10-26 Thread xiazhl
web-ui??metrics ---- ??: "user-zh"

flink keyby??????????????????

2021-10-26 Thread xiazhl
hello everyone?? ??keyby?? ??flink streamAPI?? 10 ??flinkidkeyby id

Re: Re: Flink SQL 1.12 批量数据导入,如果加速性能

2021-10-26 Thread WuKong
Hi: 就是最简单的 定义一个Source table 一个Sink table 相同的Schema , 比如 insert into tableB select * from tableA ; 执行启8个并行度的话, 会有个7个并行度是Finish 状态 只有一个 在串行的导入数据, 其中schema 例如: CREATE TABLE tableA ( columnOne STRING, columnTwo BIGINT, PRIMARY KEY (`columnTwo `) NOT ENFORCED ) WITH ( 'connector' = 'jdbc',