Re: 退订 (Unsubscribe)

2021-09-09 Thread Caizhi Weng
Hi! To unsubscribe from the Chinese user mailing list, send an email with any content to user-zh-unsubscr...@flink.apache.org. The unsubscribe addresses of the other mailing lists are listed at https://flink.apache.org/community.html#mailing-lists lutty shi wrote on Fri, Sep 10, 2021 at 1:37 PM: > Unsubscribe (退订) > > Sent from NetEase Mail Master (网易邮箱大师)

退订 (Unsubscribe)

2021-09-09 Thread lutty shi
Unsubscribe (退订). Sent from NetEase Mail Master (网易邮箱大师)

Re: Re: Re: Re: Temporal Joins reports "Currently the join key in Temporal Table Join can not be empty"

2021-09-09 Thread Caizhi Weng
Hi! Thanks for the continued feedback. This time I was indeed able to reproduce the problem in a development environment. The cause is that the event time temporal join's handling of the join key is still incomplete: it can currently only handle plain input columns (e.g. using uuid directly) and cannot yet handle expressions computed from input columns (such as extracting a value from a map, as done here). I have opened an issue to track this [1]. A workaround is to extract the map value in a view first, for example: CREATE TABLE A ( a MAP, ts TIMESTAMP(3), WATERMARK FOR ts AS ts ); CREATE TABLE B ( id INT, ts
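For illustration, a minimal sketch of the view-based workaround described above, written in Java against the Table API. The table and column names are borrowed from the thread, the datagen connector is only a stand-in for the real source, and the final temporal join is only outlined in a comment, so treat this as an assumption-laden sketch rather than the exact fix:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class TemporalJoinKeyWorkaround {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // Source table whose join key is buried inside a MAP column
        // (datagen is only a placeholder connector for this sketch).
        tEnv.executeSql(
                "CREATE TABLE stub_trans ("
                + "  columnInfos MAP<STRING, STRING>,"
                + "  procTime TIMESTAMP(3),"
                + "  WATERMARK FOR procTime AS procTime"
                + ") WITH ('connector' = 'datagen')");

        // Lift the map entry into a plain top-level column first; the temporal
        // join can then use `uuid` directly as its join key.
        tEnv.executeSql(
                "CREATE VIEW stub_trans_keyed AS "
                + "SELECT columnInfos['uuid'] AS uuid, procTime FROM stub_trans");

        // The event-time temporal join is then written against the view, e.g.:
        //   SELECT ... FROM stub_trans_keyed AS t
        //   JOIN some_versioned_table FOR SYSTEM_TIME AS OF t.procTime AS d
        //   ON t.uuid = d.uuid
    }
}
```

Once the map access lives in the view, the join key seen by the planner is a plain column, which is what the current implementation supports.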

Re: Usecase for flink

2021-09-09 Thread JING ZHANG
Hi Dipanjan, based on your description, I think Flink could handle this use case. Don't worry about Flink being unable to handle data at this scale, because Flink is a distributed engine. As long as data skew is carefully avoided, the input throughput can be handled through appropriate

Re: State processor API very slow reading a keyed state with RocksDB

2021-09-09 Thread Yun Tang
Hi David, I think Seth has shared some useful information. If you want to know what happens inside RocksDB while you're reading, you can use async-profiler [1] to capture the RocksDB stacks; I suspect the index block might be evicted too frequently during your reads. And we could use new

Usecase for flink

2021-09-09 Thread Dipanjan Mazumder
Hi, I am working on a use case and thinking of using Flink for it. I will have many large resource graphs; I need to parse each graph's nodes and edges and evaluate each of them against some Siddhi rules. Right now the implementation for evaluating

Re: flink cdc SQL2ES,GC overhead limit exceeded

2021-09-09 Thread Caizhi Weng
Hi! Have you tried increasing the heap memory? Does the problem persist after increasing it? If convenient, could you share the SQL text? LI YUAN <27297...@qq.com.invalid> wrote on Fri, Sep 10, 2021 at 9:39 AM: > java.lang.OutOfMemoryError: GC overhead limit exceeded > at org.rocksdb.RocksIterator.key0(Native Method) > at org.rocksdb.RocksIterator.key(RocksIterator.java:37) > at >

Anomalies in the data returned by the Flink REST API and displayed in the UI

2021-09-09 Thread yidan zhao
As the subject says. So far I have mainly looked at watermarks: my windows produce output normally, but the watermark shown by the REST API and in the UI is clearly and severely delayed. The actual watermark must be fine, because the windows fire normally. This code has not been changed and used to work correctly. Also, judging from the browser's network monitor, the data returned by the API seems to never change.

flink cdc SQL2ES, GC overhead limit exceeded

2021-09-09 Thread LI YUAN
java.lang.OutOfMemoryError: GC overhead limit exceeded at org.rocksdb.RocksIterator.key0(Native Method) at org.rocksdb.RocksIterator.key(RocksIterator.java:37) at org.apache.flink.contrib.streaming.state.RocksIteratorWrapper.key(RocksIteratorWrapper.java:99) at

Re: Documentation for deep diving into flink (data-streaming) job restart process

2021-09-09 Thread Puneet Duggal
Hi, please find attached the log file for the job not getting restarted on another task manager after the existing task manager was restarted. Just FYI, we are using Fixed Delay Restart (5 times, 10 s delay). On Thu, Sep 9, 2021 at 4:29 PM Robert Metzger wrote: > Hi Puneet, > > Can you provide us with
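For reference, a minimal, hedged sketch of how a fixed-delay restart strategy of this shape (5 attempts, 10 s delay) is typically configured programmatically; the equivalent cluster-wide settings are the restart-strategy.fixed-delay.attempts and restart-strategy.fixed-delay.delay options in flink-conf.yaml. The job body below is a placeholder.

```java
import java.util.concurrent.TimeUnit;

import org.apache.flink.api.common.restartstrategy.RestartStrategies;
import org.apache.flink.api.common.time.Time;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FixedDelayRestartExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Restart the job at most 5 times, waiting 10 seconds between attempts,
        // matching the strategy mentioned in the thread.
        env.setRestartStrategy(RestartStrategies.fixedDelayRestart(5, Time.of(10, TimeUnit.SECONDS)));

        // Placeholder pipeline so the example is self-contained.
        env.fromElements(1, 2, 3).print();
        env.execute("fixed-delay-restart-example");
    }
}
```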

Re: Job manager crash

2021-09-09 Thread mejri houssem
Thanks for the response. With respect to the API server, I don't think I can do much about it, because I am only using a specific namespace in the Kubernetes cluster and I am not the one who administers it. Otherwise, I will try the GC log option to see if I can find something useful in order to

Re: Job manager crash

2021-09-09 Thread houssem
Hello, with respect to the API server I don't re On 2021/09/09 11:37:49, Yang Wang wrote: > I think @Robert Metzger is right. You need to check > whether your Kubernetes APIServer is working properly or not (e.g. > overloaded). > > Another hint is about the fullGC. Please use the following

Re: Issue while creating Hive table from Kafka topic

2021-09-09 Thread Robert Metzger
Does the jar file you are trying to submit contain the org/apache/kafka/common/serialization/ByteArrayDeserializer class? On Thu, Sep 9, 2021 at 2:10 PM Harshvardhan Shinde < harshvardhan.shi...@oyorooms.com> wrote: > Here's the complete stack trace: > > Server
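As one hedged way of answering that question from the running side (the jar contents can of course also be inspected directly on the packaging side), here is a small self-contained check of whether the class named above is visible on the classpath and which jar it was loaded from. All names besides the class name are illustrative:

```java
import java.security.CodeSource;

public class ClasspathCheck {
    public static void main(String[] args) {
        // Class name taken from the discussion above.
        String name = "org.apache.kafka.common.serialization.ByteArrayDeserializer";
        try {
            Class<?> clazz = Class.forName(name);
            CodeSource source = clazz.getProtectionDomain().getCodeSource();
            // Prints the jar (or directory) the class was loaded from, when known.
            System.out.println(name + " loaded from "
                    + (source != null ? source.getLocation() : "an unknown location"));
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is NOT on the classpath");
        }
    }
}
```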

Re: Job crashing with RowSerializer EOF exception

2021-09-09 Thread Robert Metzger
Hi Yuval, EOF exceptions during serialization are usually an indication that some serializer in the serializer chain is somehow broken. What data type are you serializing? Does it include some type serialized by a custom serializer, or Kryo, ...? On Thu, Sep 9, 2021 at 4:35 PM Yuval
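Not a fix for the exception above, but a common first diagnostic step in this direction is to make any silent fallback to Kryo visible. A minimal, hedged sketch (the pipeline itself is a placeholder):

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class KryoFallbackCheck {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Fail fast (instead of silently using Kryo) whenever a type in the job
        // would be treated as a generic type, which makes problematic entries in
        // the serializer chain easier to spot.
        env.getConfig().disableGenericTypes();

        // Placeholder pipeline; Strings use a dedicated serializer, so this runs.
        env.fromElements("a", "b", "c").print();
        env.execute("kryo-fallback-check");
    }
}
```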

Re: State processor API very slow reading a keyed state with RocksDB

2021-09-09 Thread Seth Wiesman
Hi David, I was also able to reproduce the behavior, but was able to get significant performance improvements by reducing the number of slots on each TM to 1. My suspicion, as Piotr alluded to, has to do with the different runtime execution of DataSet over DataStream. In particular, Flink's

Re: Re: Re: Re: Temporal Joins reports "Currently the join key in Temporal Table Join can not be empty"

2021-09-09 Thread Wayne
Hello, sorry to bother you. I tried again recently and still get this error. My Flink version is flink-1.12.2-bin-scala_2.12, executed with the SQL client. My SQL is as follows: CREATE TABLE stub_trans ( `uuid` STRING, `columnInfos` MAP NOT NULL> NOT NULL, procTime TIMESTAMP(3) METADATA FROM 'timestamp' , WATERMARK FOR procTime AS procTime ) WITH ( 'connector' =

DataStreamAPI and Stateful functions

2021-09-09 Thread Barry Higgins
Hi, I'm looking at using the DataStream API from a Flink application against a remote Python stateful function deployed on another machine. I would like to investigate how feasible it is to have all of the state management handled on the calling side, meaning that we don't need another

Job crashing with RowSerializer EOF exception

2021-09-09 Thread Yuval Itzchakov
Hi, Flink 1.13.2, Scala 2.12.7. Running an app in production, I'm running into the following exception, which frequently fails the job: switched from RUNNING to FAILED with failure cause: java.io.IOException: Can't get next record for channel InputChannelInfo{gateIdx=0, inputChannelIdx=2} at

Re: Duplicate copies of job in Flink UI/API

2021-09-09 Thread Chesnay Schepler
I'm afraid there's no real workaround. If the information for completed jobs isn't important to you then setting jobstore.expiration-time to a low value can reduce the impact, or setting jobstore.max-capacity to 0 would prevent any completed job from being displayed. Beyond that I can't

Re: Duplicate copies of job in Flink UI/API

2021-09-09 Thread Peter Westermann
Thanks Chesnay. You are understanding this correctly, and your explanation makes sense to me. Is there anything we can do to prevent this? At least for us, most of the time a leader election happens, the leader doesn't actually change because the jobmanager is still healthy. Thanks, Peter From:

Re: Duplicate copies of job in Flink UI/API

2021-09-09 Thread Chesnay Schepler
Just to double-check that I'm understanding things correctly: You have a job with HA, then Zookeeper breaks down, the job gets suspended, ZK comes back online, and the _same_ JobManager becomes the leader? If so, then I can explain why this happens and hopefully reproduce it. In short, when

Re: Issue while creating Hive table from Kafka topic

2021-09-09 Thread Harshvardhan Shinde
Here's the complete stack trace: Server Response:org.apache.flink.runtime.rest.handler.RestHandlerException: Could not execute application. at org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.lambda$handleRequest$1(JarRunHandler.java:108) at

Re: Issue while creating Hive table from Kafka topic

2021-09-09 Thread Robert Metzger
Can you share the full stack trace, not just a part of it? On Thu, Sep 9, 2021 at 1:43 PM Harshvardhan Shinde < harshvardhan.shi...@oyorooms.com> wrote: > Hi, > > I added the dependencies while trying to resolve the same issue, thought I > was missing them. > > Thanks > > On Thu, Sep 9, 2021 at

Re: Allocation-preserving scheduling and task-local recovery

2021-09-09 Thread Robert Metzger
Hi, from my understanding of the code [1], task scheduling first considers the state location and then uses the evenly-spread-out scheduling strategy as a fallback. So, as I read the code, local recovery should take precedence over the evenly-spread-out strategy. If you can
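For reference, the two cluster settings this discussion revolves around are state.backend.local-recovery and cluster.evenly-spread-out-slots, which normally live in flink-conf.yaml. A small, hedged sketch that only spells out the keys via the Configuration API (the values here are assumptions):

```java
import org.apache.flink.configuration.CheckpointingOptions;
import org.apache.flink.configuration.Configuration;

public class LocalRecoverySettings {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // Keep task state on the TaskManager's local disk so a restarted task
        // rescheduled into its previous slot can recover without a remote fetch.
        conf.set(CheckpointingOptions.LOCAL_RECOVERY, true);

        // The fallback scheduling strategy mentioned above (raw key on purpose).
        conf.setString("cluster.evenly-spread-out-slots", "true");

        System.out.println(conf);
    }
}
```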

Re: Issue while creating Hive table from Kafka topic

2021-09-09 Thread Harshvardhan Shinde
Hi, I added the dependencies while trying to resolve the same issue, thought I was missing them. Thanks On Thu, Sep 9, 2021 at 4:26 PM Robert Metzger wrote: > Hey, > > Why do you have these dependencies in your pom? > > > > org.apache.kafka >

Re: Job manager crash

2021-09-09 Thread Yang Wang
I think @Robert Metzger is right. You need to check whether your Kubernetes APIServer is working properly or not (e.g. overloaded). Another hint is about full GC. Please use the following config option to enable the GC logs and check the full GC time. env.java.opts.jobmanager: -verbose:gc

Re: Duplicate copies of job in Flink UI/API

2021-09-09 Thread Peter Westermann
Hi Piotr, Jobmanager logs are attached to this email. The only thing that jumps out to me is this: 09/08/2021 09:02:26.240 -0400 ERROR org.apache.flink.runtime.history.FsJobArchivist Failed to archive job. java.io.IOException: File already

Re: Documentation for deep diving into flink (data-streaming) job restart process

2021-09-09 Thread Robert Metzger
Hi Puneet, Can you provide us with the JobManager logs of this incident? Jobs should not disappear, they should restart on other Task Managers. On Wed, Sep 8, 2021 at 3:06 PM Puneet Duggal wrote: > Hi, > > So for past 2-3 days i have been looking for documentation which > elaborates how flink

Re: Issue while creating Hive table from Kafka topic

2021-09-09 Thread Robert Metzger
Hey, why do you have these dependencies in your pom? org.apache.kafka:kafka-clients:2.8.0 and org.apache.kafka:kafka_2.12:2.8.0. They are not needed for using the Kafka connector of

Re: Job manager crash

2021-09-09 Thread Robert Metzger
Is the kubernetes server you are using particularly busy? Maybe these issues occur because the server is overloaded? "Triggering checkpoint 2193 (type=CHECKPOINT) @ 1630681482667 for job ." "Completed checkpoint 2193 for job (474

Issue while creating Hive table from Kafka topic

2021-09-09 Thread Harshvardhan Shinde
Hi, I'm trying a simple Flink job that reads data from a Kafka topic and creates a Hive table. I'm following the steps from here. Here's my code: import
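For orientation, a rough, hedged sketch of the general Kafka-topic-to-Hive-table pattern being attempted in this thread, using the SQL/Table API. This is not the poster's code: all names, addresses, and paths are placeholders, and the Kafka and Hive connector jars plus a reachable Hive Metastore and Kafka broker are assumed.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.SqlDialect;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.catalog.hive.HiveCatalog;

public class KafkaToHiveSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // Register a Hive catalog (catalog name and hive-conf dir are placeholders).
        HiveCatalog hive = new HiveCatalog("myhive", "default", "/opt/hive-conf");
        tEnv.registerCatalog("myhive", hive);
        tEnv.useCatalog("myhive");

        // Kafka source table, declared with the default (Flink) SQL dialect.
        tEnv.getConfig().setSqlDialect(SqlDialect.DEFAULT);
        tEnv.executeSql(
                "CREATE TABLE IF NOT EXISTS kafka_events ("
                + "  id BIGINT,"
                + "  message STRING"
                + ") WITH ("
                + "  'connector' = 'kafka',"
                + "  'topic' = 'events',"
                + "  'properties.bootstrap.servers' = 'broker:9092',"
                + "  'scan.startup.mode' = 'earliest-offset',"
                + "  'format' = 'json')");

        // Hive destination table, declared with the Hive dialect.
        tEnv.getConfig().setSqlDialect(SqlDialect.HIVE);
        tEnv.executeSql(
                "CREATE TABLE IF NOT EXISTS hive_events ("
                + "  id BIGINT,"
                + "  message STRING"
                + ") STORED AS parquet");

        // Continuously copy the Kafka stream into the Hive table.
        tEnv.getConfig().setSqlDialect(SqlDialect.DEFAULT);
        tEnv.executeSql("INSERT INTO hive_events SELECT id, message FROM kafka_events");
    }
}
```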

After an HBase column family's TTL expires, Flink can no longer write data

2021-09-09 Thread xiaohui zhang
Flink: 1.12.1, Flink connector: 2.2, HBase: 2.1.0 + CDH 6.3.2. Symptom: if an HBase column family has a TTL configured, then once data written for a given rowkey reaches the expiration time, the column family is marked as deleted by HBase. If data for the same key arrives afterwards, Flink is unable to write it into HBase, and queries show the column family in HBase as permanently empty. The procedure is roughly as follows: create an HBase table, test, with two column families, cf1 with TTL 60 and cf2 with TTL 120, i.e. data TTLs of 1 minute and 2 minutes respectively. Use SQL to write data into the table: insert into test select 'rowkey',
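To make the reported procedure concrete, a hedged sketch of the Flink-side table declaration and write. This is not the reporter's exact SQL: the qualifier names and ZooKeeper address are placeholders, and the connector identifier assumes the flink-connector-hbase-2.2 module mentioned above.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class HbaseTtlWriteSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // Flink-side view of the HBase table `test` with the two column families
        // from the report (the TTLs themselves are configured on the HBase side).
        tEnv.executeSql(
                "CREATE TABLE test ("
                + "  rowkey STRING,"
                + "  cf1 ROW<q STRING>,"
                + "  cf2 ROW<q STRING>,"
                + "  PRIMARY KEY (rowkey) NOT ENFORCED"
                + ") WITH ("
                + "  'connector' = 'hbase-2.2',"
                + "  'table-name' = 'test',"
                + "  'zookeeper.quorum' = 'localhost:2181')");

        // Write one row; in the report, repeating a write like this after the
        // column family's TTL had expired no longer produced visible data in HBase.
        tEnv.executeSql("INSERT INTO test SELECT 'rowkey', ROW('v1'), ROW('v2')");
    }
}
```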

Re: aws s3 configuring error for flink image

2021-09-09 Thread Chesnay Schepler
This is a limitation of the presto version; use flink-s3-fs-hadoop-1.11.3.jar instead. On 08/09/2021 20:39, Dhiru wrote: I copied FROM flink:1.11.3-scala_2.12-java11 RUN mkdir ./plugins/flink-s3-fs-presto RUN cp ./opt/flink-s3-fs-presto-1.11.3.jar   ./plugins/flink-s3-fs-presto/ then started

Re: State processor API very slow reading a keyed state with RocksDB

2021-09-09 Thread Piotr Nowojski
Hi David, I can confirm that I'm able to reproduce this behaviour. I've tried profiling/flame graphs and was not able to make much sense of the results. There are no IO/memory bottlenecks that I could notice; it looks indeed like the job is stuck inside RocksDB itself. This might be an