[jira] [Commented] (KAFKA-17704) possible race condition in TTL credentials when connectors recycled on single node instance

Greg Harris (Jira) Thu, 10 Oct 2024 11:36:53 -0700


    [ 
https://issues.apache.org/jira/browse/KAFKA-17704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888394#comment-17888394
 ]


Greg Harris commented on KAFKA-17704:
-------------------------------------

[~pavlov2000uk] This is caused by the AppliedConnectorConfig's caching 
behavior. When new task configs are published at 13:53, a new 
AppliedConnectorConfig object is created, which doesn't have the secrets 
resolved yet. At 13:55 the secrets on this object are resolved for the first 
time, so both sides of the comparison get the new secrets and look equal. If 
you were to change the credentials again, the secrets _would_ be cached, and 
the configurations would appear different.
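
To make the caching effect concrete, here is a minimal sketch in Python (not 
Connect's Java). It is a toy model under stated assumptions, not the actual 
AppliedConnectorConfig code: the class LazyAppliedConfig, the functions 
looks_changed and resolve, and the in-memory secret store are all hypothetical 
names made up for illustration.

```python
class LazyAppliedConfig:
    """Hypothetical snapshot that resolves secrets lazily and caches them."""

    def __init__(self, raw, resolver):
        self._raw = dict(raw)
        self._resolver = resolver
        self._resolved = None  # nothing is cached until the first access

    def resolved(self):
        # Resolve on first use only; later calls return the cached values.
        if self._resolved is None:
            self._resolved = {k: self._resolver(v) for k, v in self._raw.items()}
        return self._resolved


def looks_changed(applied, raw, resolver):
    """Compare the applied snapshot with a freshly resolved config."""
    current = {k: resolver(v) for k, v in raw.items()}
    return applied.resolved() != current


# Assumed stand-in for an external secret store keyed by placeholder.
store = {"${vault:db:password}": "old-secret"}


def resolve(value):
    return store.get(value, value)


raw = {"connection.password": "${vault:db:password}"}

# Case 1: the snapshot is created but not resolved before the secret rotates.
applied = LazyAppliedConfig(raw, resolve)
store["${vault:db:password}"] = "new-secret"
missed = looks_changed(applied, raw, resolve)    # both sides see "new-secret"

# Case 2: the snapshot now has "new-secret" cached, so the next rotation shows.
store["${vault:db:password}"] = "newer-secret"
detected = looks_changed(applied, raw, resolve)  # cached value differs
```

In this model the first rotation goes unnoticed because the snapshot is 
resolved only after the secret has already changed, while the second rotation 
is detected because the earlier value is cached by then.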

If you shortened the TTL (or equivalently, reduced the rate at which the config 
can change) I think it would trigger the task restarts, because it would have 
the opportunity to regenerate task configs with the old secret, and then later 
notice the difference.

This sequence of operations:
 # task config is evaluated
 # connector scheduled to restart
 # time passes
 # connector evaluates the old config
 # connector evaluates the new config
 # connector scheduled to restart
 # time passes
 # connector evaluates the new config
 # new and old configs look different
 # connector publishes new task configs
 # tasks are restarted

is not the "right" way for restarts to happen. It's the _only_ way they're 
happening right now, and it only happens sometimes. The right flow, which is 
what KAFKA-17627 is proposing, is:
 # task config is evaluated
 # task is scheduled to restart
 # time passes
 # task restarts
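
As a rough illustration of that shorter flow, here is a small Python 
simulation. It is only a sketch under assumptions and does not reflect how 
KAFKA-17627 is or will be implemented: run_ttl_restarts, resolve_with_ttl and 
the millisecond clock are hypothetical names chosen for the example. The point 
is that the restart is driven by the TTL alone, with no comparison between old 
and new configs.

```python
def run_ttl_restarts(resolve_with_ttl, start_ms, end_ms):
    """Evaluate the config, schedule a restart at TTL expiry, restart, repeat.

    resolve_with_ttl(now_ms) is an assumed callback returning (secret, ttl_ms).
    Returns a list of (restart_time_ms, secret_used_after_restart).
    """
    restarts = []
    _, ttl = resolve_with_ttl(start_ms)          # task config is evaluated
    if ttl <= 0:
        raise ValueError("TTL must be positive")
    next_restart = start_ms + ttl                # task is scheduled to restart
    while next_restart <= end_ms:                # time passes
        now = next_restart
        secret, ttl = resolve_with_ttl(now)      # task restarts and re-resolves
        if ttl <= 0:
            raise ValueError("TTL must be positive")
        restarts.append((now, secret))
        next_restart = now + ttl                 # schedule the next restart
    return restarts


def rotating_secret(now_ms):
    # Assumed provider: the secret rotates every second, with a 1000 ms TTL.
    return f"secret-{now_ms // 1000}", 1000


restarts = run_ttl_restarts(rotating_secret, start_ms=0, end_ms=3000)
```

Each restart here picks up whatever the provider returns at that moment, so a 
rotated credential is never missed because of a stale or late comparison.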

Please continue to explore the code and familiarize yourself, as I am happy to 
answer questions. But I wouldn't stress about investigating this as a bug until 
KAFKA-17627 is fixed.

> possible race condition in TTL credentials when connectors recycled on single 
> node instance
> -------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-17704
>                 URL: https://issues.apache.org/jira/browse/KAFKA-17704
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 3.7.1
>            Reporter: Doug Whitfield
>            Priority: Minor
>         Attachments: For_community.zip.001, For_community.zip.002, 
> For_community.zip.003, For_community.zip.004, For_community.zip.005, 
> For_community.zip.006, For_community.zip.007, 
> image-2024-10-07-11-17-41-951.png, image-2024-10-08-19-10-13-215.png, 
> image-2024-10-09-17-04-31-915.png, logstoupload.log
>
>
> This is related to https://issues.apache.org/jira/browse/KAFKA-9228 and 
> https://issues.apache.org/jira/browse/KAFKA-17627 but in single node instance 
> and only related to credentials (as far as we know currently), so maybe 
> something else is in play?
> In some cases, when TTL is used with a single node, passwords are not passed 
> properly.
> In the "logstoupload.log" file you can see that at 09:14 the password does 
> not get changed, but at 09:24 it does get changed.
> We are able to "reliably" reproduce this in a prod-like Kubernetes 
> environment, which is where this log comes from, but we have only rarely 
> captured this "race condition" in test, where we are not using Kubernetes. 
> We have seen it without Kubernetes, though.
> We hope to provide something more reproducible next week, but perhaps 
> uploading this "full" log will allow you to guide us so we can make this more 
> reproducible.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
