The fault tolerance and recovery mechanism in batch mode within Apache Flink.

2024-02-16 Thread Вова Фролов
Hi everyone, I am currently exploring the fault tolerance and recovery mechanism in batch mode within Apache Flink. If I terminate the task manager process while the job is running, the job restarts from the point of failure. However, at some point, the job restarts from the very beginning. The

Re: Long execution of SQL query to Kafka + Hive (q77 TPC-DS)

2024-01-26 Thread Вова Фролов
by rollup (channel, id) order by channel, id LIMIT 100; Kind regards, Vladimir. On Thu, 25 Jan 2024 at 14:43, Ron liu wrote: > Hi, > > Can you help to explain the q77 execution plan? And find which operator > takes a long time in flink UI? > > Best > Ron > > Вова Фр

Long execution of SQL query to Kafka + Hive (q77 TPC-DS)

2024-01-23 Thread Вова Фролов
Hello, I am executing a heterogeneous SQL query in batch mode (part of the data is in Hive and part in Kafka; the query uses the 100 GB TPC-DS benchmark data). However, the execution time is excessively long, approximately 11 minutes, although the request to Hive only (without

Flink caching mechanism

2024-01-11 Thread Вова Фролов
Hi Everyone, I'm currently trying to understand the caching mechanism in Apache Flink in general. As part of this exploration, I have a few questions about how Flink handles data caching, both in the context of SQL queries and more broadly. When I send a SQL query, for example, to

Issue with Flink Job when Reading Data from Kafka and Executing SQL Query (q77 TPC-DS)

2023-12-19 Thread Вова Фролов
Hello Flink Community, I am writing to you about an issue I have encountered while using Apache Flink version 1.17.1. In my Flink job, I am using Kafka version 3.6.0 to ingest data from TPC-DS (current tpcds100 target size tpcds1), and then I am executing SQL queries, specifically, the q77