[jira] [Commented] (FLINK-31304) Very slow job start if topic has been used before

Martijn Visser (Jira) Thu, 02 Mar 2023 09:32:07 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-31304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17695838#comment-17695838
 ]


Martijn Visser commented on FLINK-31304:
----------------------------------------

[~YordanPavlov] Do you also start with a new transactional ID prefix for your 
application? The prefix is expected to be unique for each Flink application 
that you run on the same Kafka cluster. Given that you're basically starting 
the job from scratch, that would also require a new transactional ID prefix. 
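
To illustrate the suggestion, here is a minimal sketch of deriving a fresh transactional ID prefix for each clean start. The class name TransactionalIdPrefixes, the method forCleanStart, and the run token value are hypothetical and chosen only for illustration; the application name "state.clickhouse" is taken from the transactionalId in the log below. The sketch assumes the run token is picked once per clean start (for example by the deployment script), not generated on every launch.

```java
public final class TransactionalIdPrefixes {

    private TransactionalIdPrefixes() {
    }

    /**
     * Joins a stable application name with a token that identifies one clean start.
     * Both arguments must be non-empty so the resulting prefix is never ambiguous.
     */
    public static String forCleanStart(String appName, String runToken) {
        if (appName == null || appName.trim().isEmpty()) {
            throw new IllegalArgumentException("appName must not be empty");
        }
        if (runToken == null || runToken.trim().isEmpty()) {
            throw new IllegalArgumentException("runToken must not be empty");
        }
        return appName.trim() + "-" + runToken.trim();
    }

    public static void main(String[] args) {
        // Assumption: the token is passed in by whatever tooling performs the clean start.
        String runToken = args.length > 0 ? args[0] : "run-1";
        String prefix = forCleanStart("state.clickhouse", runToken);
        System.out.println(prefix);

        // The prefix would then be handed to the sink builder, roughly like this
        // (not compiled here, requires the Flink Kafka connector on the classpath):
        //   KafkaSink.<String>builder()
        //       .setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
        //       .setTransactionalIdPrefix(prefix)
        //       ...
    }
}
```

With a scheme like this, a job that is wiped and started from scratch would not reuse the transactional IDs that the previous incarnation registered on the Kafka cluster.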

> Very slow job start if topic has been used before
> -------------------------------------------------
>
>                 Key: FLINK-31304
>                 URL: https://issues.apache.org/jira/browse/FLINK-31304
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / Kafka
>    Affects Versions: 1.15.2
>            Reporter: Yordan Pavlov
>            Priority: Major
>
> We have the following use case. We use KafkaSink with exactly-once semantics. 
> From time to time we restart the job clean; in doing so we delete and 
> re-create the output topic and also delete any Flink checkpoints. In such a 
> situation it takes close to an hour for Flink to start. During the time the 
> job is idling we see the following log in the TaskManager:
> {code:java}
> 2023-03-02 16:33:42.004 [Source: Kafka source blocks -> Deduplicate blocks -> 
> Map -> Parse blocks -> Map -> Kafka sink volume: Writer -> Kafka sink volume: 
> Committer (2/5)#0] INFO  
> o.apache.kafka.clients.producer.internals.TransactionManager  - [Producer 
> clientId=producer-state.clickhouse-0-1-1, 
> transactionalId=state.clickhouse-0-1-1] Invoking InitProducerId for the first 
> time in order to acquire a producer ID
> 2023-03-02 16:33:42.005 [kafka-producer-network-thread | 
> producer-state.clickhouse-0-2-1] INFO  
> o.apache.kafka.clients.producer.internals.TransactionManager  - [Producer 
> clientId=producer-state.clickhouse-0-2-1, 
> transactionalId=state.clickhouse-0-2-1] ProducerId set to 31719488 with epoch 
> 8{code}
> If we use a brand new output topic name, the job would start straight away. 
> Could you advise if this can be improved?
> Such logs go on and on for what seems like forever.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
