Fwd: Can `DataStream`s "fan-in" to a single sink?

2021-09-22 Thread Antony Southworth
Hi Firstly, apologies if I commit any faux-pas, I have never used a mailing list before. At least from Googling, reading Flink docs, and searching the mailing list archives for "fan-in" didn't turn up much so hoping someone can enlighten me here. My use-case is similar to the following:

Re:re:Re: 回复:Flink SQL官方何时能支持redis和mongodb连接器?

2021-09-22 Thread casel.chen
>我司基于最新提供流批一体的接口,实现了mongodb的连接器,支持source和sink,实现了控制批量插入频率、控制缓存批的数据量和mongo文档对象和java对象转换,同时还可选择批量更新,并且使用的mongo最新异步驱动,后期我还会不断优化性能,看大佬能否推动一下,把这个连接器贡献给社区 问一下自己开发的连接器要怎么添加到 https://flink-packages.org/ 网站给大家搜索到?这位朋友能够将你们的连接器贡献上去呢? 在 2021-09-23 09:32:39,"2572805166"

????

2021-09-22 Thread Elaiza

Flink Session 模式Job日志区分

2021-09-22 Thread Ada Luna
多个Job跑在一个Session中,如何区分不同job的日志呢?目前有什么好的办法吗?

Resource leak would happen if exception thrown when flink redisson

2021-09-22 Thread a773807943
I encountered a problem in the process of integrating Flink and Redisson. When the task encounters abnormalities and keeps retries, it will cause the number of Redis Clients to increase volatility (sometimes the number increases, sometimes the number decreases, but the overall

re:Re: 回复:Flink SQL官方何时能支持redis和mongodb连接器?

2021-09-22 Thread 2572805166
我司基于最新提供流批一体的接口,实现了mongodb的连接器,支持source和sink,实现了控制批量插入频率、控制缓存批的数据量和mongo文档对象和java对象转换,同时还可选择批量更新,并且使用的mongo最新异步驱动,后期我还会不断优化性能,看大佬能否推动一下,把这个连接器贡献给社区 -- 原始邮件 -- 发件人: "Yun Tang"; 发件时间: 2021-09-22 10:55 收件人: "user-zh@flink.apache.org"; 主题: Re: 回复:Flink

??????re:?????? flink sql????????????????sink table?

2021-09-22 Thread ??????
sql??sql?? iPhone -- -- ??: 2572805166 <2572805...@qq.com.INVALID : 2021??9??23?? 09:23 ??: user-zh

??????re:?????? flink sql????????????????sink table?

2021-09-22 Thread ??????
sql??sql iPhone -- -- ??: 2572805166 <2572805...@qq.com.INVALID : 2021923 09:23 ??: user-zh

re:回复: flink sql是否支持动态创建sink table?

2021-09-22 Thread 2572805166
使用java的动态编译和类加载技术,实现类似于web项目的热加载 -- 原始邮件 -- 发件人: "JasonLee"<17610775...@163.com; 发件时间: 2021-09-22 22:33 收件人: "user-zh@flink.apache.org"; 主题: 回复: flink sql是否支持动态创建sink table? hi 事实上这个跟构建 graph 没有太大的关系 也不用在构建后调整 在构造 producer 的时候 topic 不要写死 自定义

Kafka Partition Discovery

2021-09-22 Thread Mason Chen
Hi all, We are sometimes facing a connection issue with Kafka when a broker restarts ``` java.lang.RuntimeException: org.apache.kafka.common.errors.TimeoutException: Timeout expired while fetching topic metadata at

Re: Not able to avoid Dynamic Class Loading

2021-09-22 Thread Kevin Lam
Sorry for the late reply here, I'm just returning to this now. Interesting re: the avro version, we're using 1.10.0 in our application jar. But maybe this is somehow being clobbered when we try to move it into /lib vs. /usrlib to avoid dynamic class loading. Is it possible that's happening? On

Re: Support ARM architecture

2021-09-22 Thread Robert Metzger
Hi, afaik the only real blocker for ARM support was a rocksdb binary for arm. This has been resolved and is scheduled to be released with 1.14.0: https://issues.apache.org/jira/browse/FLINK-13598 If you have an ARM machine available, you could even help the community in the release verification

Re: Many S3V4AuthErrorRetryStrategy warn logs while reading/writing from S3

2021-09-22 Thread Robert Metzger
Hey Andreas, This could be related too https://github.com/apache/hadoop/pull/110/files#diff-0a2e55a2f79ea4079eb7b77b0dc3ee562b383076fa0ac168894d50c80a95131dR950 I guess in Flink this would be s3.endpoint: your-endpoint-hostname Where your-endpoint-hostname is a region-specific endpoint, which

Re: flink rest endpoint creation failure

2021-09-22 Thread Robert Metzger
Hi, Yes, "rest.bind-port" seems to be set to "35485" on the JobManager instance. Can you double check the configuration that is used by Flink? The jobManager is also printing the effective configuration on start up. You'll probably see the value there as well. On Wed, Sep 22, 2021 at 6:48 PM

Support ARM architecture

2021-09-22 Thread Patrick Angeles
Hey all, Trying to follow FLINK-13448. Seems like all the subtasks, save for one on documentation, are completed... does this mean there will be an arm64 binary available in the next release (1.14)?

Many S3V4AuthErrorRetryStrategy warn logs while reading/writing from S3

2021-09-22 Thread Hailu, Andreas
Hi, When reading/writing to and from S3 using the flink-fs-s3-hadoop plugin on 1.11.2, we observe a lot of these WARN log statements in the logs: WARN S3V4AuthErrorRetryStrategy - Attempting to re-send the request to s3.amazonaws.com with AWS V4 authentication. To avoid this warning in the

flink rest endpoint creation failure

2021-09-22 Thread Curt Buechter
Hi, I'm getting an error that happens randomly when starting a flink application. For context, this is running in YARN on AWS. This application is one that converts from the Table API to the Stream API, so two flink applications/jobmanagers are trying to start up. I think what happens is that the

pyflink keyed stream checkpoint error

2021-09-22 Thread Curt Buechter
Hi, I'm getting an error after enabling checkpointing in my pyflink application that uses a keyed stream and rocksdb state. Here is the error message: 2021-09-22 16:18:14,408 INFO org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend [] - Closed RocksDB State Backend. Cleaning up

Re: Unbounded Kafka Source

2021-09-22 Thread Robert Cullen
Robert, So removing the setUnbounded(OffsetInitializer.latest) fixed the issue. Thanks! On Wed, Sep 22, 2021 at 11:51 AM Robert Metzger wrote: > Hi, > > What happens if you do not set any boundedness on the KafkaSource? > For a DataStream job in streaming mode, the Kafka source should be >

RE: Unbounded Kafka Source

2021-09-22 Thread Schwalbe Matthias
Hi, If I remember right, this is actually the intended behaviour: In batch mode: .setBounded(…) In streaming mode: source that finishes anyway at set offset: use .setUnbounded(…) In streaming mode: source that never finishes: don’t set a final offset (don’t .setUnbounded(…)) I might be

Re: Unbounded Kafka Source

2021-09-22 Thread Robert Metzger
Hi, What happens if you do not set any boundedness on the KafkaSource? For a DataStream job in streaming mode, the Kafka source should be unbounded. >From reading the code, it seems that setting unbounded(latest) should not trigger the behavior you mention ... but the Flink docs are not clearly

Re: Flink Performance Issue

2021-09-22 Thread Robert Metzger
Hi Kamaal, I would first suggest understanding the performance bottleneck, before applying any optimizations. Idea 1: Are your CPUs fully utilized? if yes, good, then scaling up will probably help If not, then there's another inefficiency Idea 2: How fast can you get the data into your job,

Re: Observability tools on top of Flink

2021-09-22 Thread Deepak Sharma
I am interested in learning this fact as well as I need to put in observability in a flink pipeline. On Wed, 22 Sep 2021 at 8:40 PM, Dan Hill wrote: > Hi! > > I saw a recent Medium article >

Can't access Debezium metadata fields in Kafka table

2021-09-22 Thread Harshvardhan Shinde
Hi, I'm trying to access the metadata columns from the debezium source connector as documented here . However I'm getting the following error when I try to select the rows from

Observability tools on top of Flink

2021-09-22 Thread Dan Hill
Hi! I saw a recent Medium article

????

2021-09-22 Thread ??????????

回复: flink sql是否支持动态创建sink table?

2021-09-22 Thread spoon_lz
“在 datastream api 任务是可以的” 这样是可行的吗,我的理解flink是要先构建好graph之后才能运行,graph构建好之后可能没办法再动态调整了,除非写一个自定义的sink,自己实现逻辑 在2021年09月22日 19:25,JasonLee<17610775...@163.com> 写道: hi 这个我理解在 SQL 任务里面目前是没办法做到的 在 datastream api 任务是可以的 Best JasonLee 在2021年9月22日 11:35,酷酷的浑蛋 写道:

回复: flink sql是否支持动态创建sink table?

2021-09-22 Thread JasonLee
hi 这个我理解在 SQL 任务里面目前是没办法做到的 在 datastream api 任务是可以的 Best JasonLee 在2021年9月22日 11:35,酷酷的浑蛋 写道: 我也有这个需求,意思就是topic里实时新增了一种日志,然后想动态创建对应新的日志的topic表,并写入到新的topic表,在一个任务中完成 | | apache22 | | apach...@163.com | 签名由网易邮箱大师定制 在2021年09月22日 11:23,Caizhi Weng 写道: Hi! 不太明白这个需求,但如果希望发送给不同的 topic,需要给每个

回复:flink消费kafka分区消息不均衡问题

2021-09-22 Thread JasonLee
hi 图片看不到 我猜大概有两种情况 第一种是你的 source 本身就存在数据倾斜 某几个分区的数据量比其他分区的多 需要修改数据写入 kafka 分区策略让数据尽量均匀 第二种是你的下游计算的时候出现数据倾斜(或其他原因)导致任务反压到 source 端 这种情况需要根据实际的情况采用不同的解决方案 单纯的增加并发和改变 slot 数量没有什么效果 Best JasonLee 在2021年9月22日 09:22,casel.chen 写道: kafka topic有32个分区,实时作业开了32个并行度消费kafka

Re: S3 access permission error

2021-09-22 Thread Harshvardhan Shinde
Hi, I was facing the same issue, the best way to solve this is to use the IAM role (which is the recommended way) instead of the access keys. Hope this helps. On Wed, Sep 22, 2021 at 1:32 PM Yangze Guo wrote: > I'm not an expert on S3. If it is not a credential issue, have you > finish the

Re: Flink Performance Issue

2021-09-22 Thread Mohammed Kamaal
Hi Arvid, The throughput has decreased further after I removed all the rebalance(). The performance has decreased from 14 minutes for 20K messages to 20 minutes for 20K messages. Below are the tasks that the flink application is performing. I am using keyBy and Window operation. Do you think

Re: Stream join with (changing) dimension in Kafka

2021-09-22 Thread Caizhi Weng
Hi! What type of time attribute is u_ts? If it is an event time attribute then this query you're running is an event time temporal table join, which will pause outputting records until the watermark from both inputs has risen above the row time of that record. As the dimension table is changing

Stream join with (changing) dimension in Kafka

2021-09-22 Thread John Smith
Hi, I'm trying to use temporal join in Table API to enrich a stream of pageview events with a slowly changing dimension of user information. The pageview events are in a kafka topic called *pageviews* and the user information are in a kafka topic keyed by *userid* and whenever there is an updated

Re: S3 access permission error

2021-09-22 Thread Yangze Guo
I'm not an expert on S3. If it is not a credential issue, have you finish the checklist of this doc[1]? [1] https://aws.amazon.com/premiumsupport/knowledge-center/emr-s3-403-access-denied/?nc1=h_ls Best, Yangze Guo On Wed, Sep 22, 2021 at 3:39 PM Dhiru wrote: > > > Not sure @yangze ... but

Re: S3 access permission error

2021-09-22 Thread Dhiru
Not sure @yangze ...  but other services which are deployed in same places we are able to access s3 bucket, the link which you share are recommended way, if we have access to s3 then we should not pass credentials ? On Wednesday, September 22, 2021, 02:59:05 AM EDT, Yangze Guo wrote:

Re: S3 access permission error

2021-09-22 Thread Yangze Guo
You might need to configure the access credential. [1] [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/filesystems/s3/#configure-access-credentials Best, Yangze Guo On Wed, Sep 22, 2021 at 2:17 PM Dhiru wrote: > > > i see

Re: S3 access permission error

2021-09-22 Thread Dhiru
i see org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:2326) plugin is not able to create folder , not sure if I need to change something Whereas when We are trying to pass from the local laptop and passing  aws credentails its able to create a folder and running as expected  On