Re: [External] naming table stages

2021-07-28 Thread JING ZHANG
Hi Yuval, Thanks for pointing that out. Yes, metrics names would also be affected. The need sounds reasonable, we could create a JIRA and give a detailed description of the requirements. cc @godfrey @Kurt to provide more user perspective, they may be interested in the feature. Best, JING ZHANG

Re: Table Aggregate - Flink SQL

2021-07-28 Thread JING ZHANG
Hi Pranav, Table Aggregate and Window Table Aggregate are now only supported in TableAPI, they are not supported by SQL yet. I think the most challenge is how to support those two features (Table Aggregate and Window Table Aggregate) based on standard SQL gramma since aggregate means return a sin

Re: TaskManager crash after cancelling a job

2021-07-28 Thread Yangze Guo
In your case, the entry point is the `cleanUpInvoke` function called by `StreamTask#invoke`. @ro...@apache.org Could you take another look at this? Best, Yangze Guo On Thu, Jul 29, 2021 at 2:29 AM Ivan Yang wrote: > > Hi Yangze, > > I deployed 1.13.1, same problem exists. It seems like that the

Re: TaskManager crash after cancelling a job

2021-07-28 Thread Ivan Yang
Hi Yangze, I deployed 1.13.1, same problem exists. It seems like that the cancel logic has changed since 1.11.0 (which was the one we have been running for almost 1 year). In 1.11.0, during the cancellation, we saw some subtask stays in the cancelling state for sometime, but eventually the job

Table Aggregate - Flink SQL

2021-07-28 Thread Pranav Patil
I want to create a Python UDF for a table aggregate function. The documentation explains this, and how to use its results by calling the flatAggregate function. However, I would not like to use the Table API. I would like to call the table aggregate function from Flink SQL. I'm using Flink 1.13. My

Re: [External] naming table stages

2021-07-28 Thread Yuval Itzchakov
Hi Jing, An additional challenge with the current Table API / SQL approach for iperator naming is that it makes it very hard to export metrics, i.e. to track watermarks with Prometheus, when operator names are not assignable by the user. On Wed, Jul 28, 2021, 13:11 JING ZHANG wrote: > Hi Clemen

Checkpoints question

2021-07-28 Thread Kirill Kosenko
Hello I'm new to Flink. I am playing with Stateful Functions and have a question about checkpoints and how they work. Some configuration details: state.backend: rocksdb state.backend.incremental: true execution.checkpointing.mode: AT_LEAST_ONCE As far as I know: 1. There is a sync checkp

Re: Issue with Flink SQL using RocksDB backend

2021-07-28 Thread Timo Walther
Hi Yuval, having a locally reproducible result would be great. Also more information about the used data types. Because this could be a serializer issue that messes up the binary format. Regards, Timo On 27.07.21 07:37, Yuval Itzchakov wrote: Hi Jing, Yes, FIRST is a UDAF. I've been tryi

Re: foreach exec sql

2021-07-28 Thread Timo Walther
Btw you are executing a lot of Flink jobs in parallel with this because the submission is async. Maybe the concept of a StatementSet via TableEnvironment.createStatementSet() helps. Regards, Timo On 27.07.21 10:56, Caizhi Weng wrote: Hi! Try this: sql.zipWithIndex.foreach { case (sql, idx)

Re: [External] naming table stages

2021-07-28 Thread JING ZHANG
Hi Clemens, This feature is temporarily not supported. Your needs sounds reasonable, you could create a JIRA ticket. Now operator name contains a lot of detailed information, it has advantages and disadvantages. When we need troubleshoot problems, detailed information could help us to know what o

Calling a stateful fuction from Flink Job - DataStream Integration

2021-07-28 Thread Deniz Koçak
Hi, We would like to host a separate service (logically and possibly physically) from the Flink Job which also we would like to call from our Flink Job. We have been considering remote functions via HTTP which seems to be good option on cloud. We have been considering Async I/o and statefun to use

Re: Session Windows should have a max size

2021-07-28 Thread Caizhi Weng
Hi! You can use DynamicProcessingTimeSessionWindows with your own SessionWindowTimeGapExtractor implementation. You can count the number of records processed in the extractor and return a time gap of almost zero (but not exactly zero, as it is invalid) if the number of records exceeds the limit.

Session Windows should have a max size

2021-07-28 Thread Prashant Deva
It seems there is no way to set a maximum size of events for a session window. This results in a security vulnerability. Example: I am recording all the user interaction events of a browser session. A malicious user can then generate hundreds of thousands or even millions of events, and cause out o

Re: IllegalArgumentException: The configured managed memory fraction for Python worker process must be within (0, 1], was: %s

2021-07-28 Thread Dian Fu
Have you mixed use of Python Table API and Python DataStream API and converted DataStream to Table in your program? If so, there is an issue https://issues.apache.org/jira/browse/FLINK-23133 which may be related (already fixed in 1.12.5 ). PS