oneby-wang opened a new pull request, #25889:
URL: https://github.com/apache/pulsar/pull/25889

   ### Motivation
   
   `PulsarFunctionTlsTest.testFunctionsCreation` is flaky when the two function workers are in a leadership switchover window. The test can observe a worker/leader state that is already stale by the time the create-function request is internally forwarded to `/admin/v3/functions/leader/...`.
   
   In that window, the coordination topic can already point to the new leader while the new leader is still finishing initialization. The forwarded request then fails with a transient `HTTP 503` response: `Leader not yet ready. Please retry again`.
   
   Reproduction log snippet:
   
   ```text
   2026-05-29T22:21:35,931 - INFO  - [assignment-tailer-thread:FunctionAssignmentTailer] - assignment tailer thread exiting {}
   2026-05-29T22:21:35,931 - INFO  - [pulsar-external-listener-5215-1:FunctionAssignmentTailer] - Closing function assignment tailer {}
   2026-05-29T22:21:35,932 - INFO  - [pulsar-external-listener-5215-1:FunctionMetaDataManager] - FunctionMetaDataManager becoming leader by creating exclusive producer {}
   ...
   2026-05-29T22:21:36,011 - INFO  - [pulsar-web-5184-20:JettyRequestLogFactory] - HTTP request {bytesOut=53, clientAddr=127.0.0.1, clientPort=52607, durationMs=3, method=PUT, proto=HTTP/1.1, referer=null, status=503, uri=https://localhost:52588/admin/v3/functions/leader/my-tenant/my-ns/function-0, user=null, userAgent=Pulsar-Java-v5.0.0-M1-SNAPSHOT}
   2026-05-29T22:21:36,012 - ERROR - [pulsar-web-5006-16:ComponentImpl] - Failed to update function on leader {error=Update Failed}
   org.apache.pulsar.client.admin.PulsarAdminException$ServerSideErrorException: Leader not yet ready. Please retry again
           at org.apache.pulsar.client.admin.PulsarAdminException.wrap(PulsarAdminException.java:252)
           at org.apache.pulsar.client.admin.internal.BaseResource.sync(BaseResource.java:366)
           at org.apache.pulsar.client.admin.internal.FunctionsImpl.updateOnWorkerLeader(FunctionsImpl.java:706)
   ```
   
   ### Modifications
   
   - Remove the test-side pre-check of the worker leader state before function creation.
   - Retry `createFunctionWithUrl` only when the failure is a `PulsarAdminException` with status code `503` and `Leader not yet ready` in the HTTP error body.
   - Keep all other admin failures visible immediately so TLS, auth, validation, or non-transient service errors are not hidden.
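
The retry rule described in the list above can be sketched roughly as follows. This is an illustrative, self-contained approximation and not the code in this PR: `AdminException` is a hypothetical stand-in for `PulsarAdminException`, and the helper names, attempt limit, and delay are assumptions chosen for the example.

```java
import java.util.concurrent.Callable;

public class LeaderRetrySketch {

    /** Hypothetical stand-in for PulsarAdminException: an HTTP status plus the error body. */
    static class AdminException extends Exception {
        final int statusCode;
        final String httpError;

        AdminException(int statusCode, String httpError) {
            super(httpError);
            this.statusCode = statusCode;
            this.httpError = httpError;
        }
    }

    /** True only for the transient "leader is still initializing" failure. */
    static boolean isLeaderNotReady(AdminException e) {
        return e.statusCode == 503
                && e.httpError != null
                && e.httpError.contains("Leader not yet ready");
    }

    /**
     * Runs the action and retries only on the transient failure.
     * Any other failure, or the last failed attempt, is rethrown unchanged.
     */
    static <T> T retryWhileLeaderNotReady(Callable<T> action, int maxAttempts, long delayMs)
            throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return action.call();
            } catch (AdminException e) {
                if (!isLeaderNotReady(e) || attempt >= maxAttempts) {
                    throw e; // keep TLS, auth, and validation errors visible
                }
                Thread.sleep(delayMs);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated create call: fails twice with the transient 503, then succeeds.
        String result = retryWhileLeaderNotReady(() -> {
            if (++calls[0] < 3) {
                throw new AdminException(503, "Leader not yet ready. Please retry again");
            }
            return "created";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Matching on both the status code and the error body keeps the retry narrow: a `503` with a different body, or any other status, fails on the first attempt instead of being masked by the retry loop.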
   
   ### Verifying this change
   
   This change is already covered by existing tests:
   
   - `./gradlew :pulsar-broker:test --tests org.apache.pulsar.functions.worker.PulsarFunctionTlsTest.testFunctionsCreation`
   
   ### Does this pull request potentially affect one of the following parts:
   
   - [ ] Dependencies (add or upgrade a dependency)
   - [ ] The public API
   - [ ] The schema
   - [ ] The default values of configurations
   - [ ] The threading model
   - [ ] The binary protocol
   - [ ] The REST endpoints
   - [ ] The admin CLI options
   - [ ] The metrics
   - [ ] Anything that affects deployment
   

