nodece opened a new pull request, #23515:
URL: https://github.com/apache/pulsar/pull/23515
### Motivation
The test
`org.apache.pulsar.client.api.BrokerServiceLookupTest#testLookupConnectionNotCloseIfGetUnloadingExOrMetadataEx`
fails locally and in the `ascentstream/pulsar` CI:
```
2024-10-26T01:32:04,373 - INFO - [pulsar-client-io-37-3:ConnectionHandler]
- [persistent://public/default/tp-359a79af-7dbe-4aa6-9e2a-6a1af63c4d8e] [s1]
Closed connection [id: 0x14fd4828, L:/127.0.0.1:49376 -
R:localhost/127.0.0.1:49366] -- Will try again in 0.1 s, hostUrl: null
2024-10-26T01:32:04,378 - INFO -
[bookkeeper-ml-scheduler-OrderedScheduler-2-0:ManagedCursorImpl] -
[public/default/persistent/tp-359a79af-7dbe-4aa6-9e2a-6a1af63c4d8e][s1] Closed
cursor at md-position=7:-1
2024-10-26T01:32:04,379 - INFO -
[bookkeeper-ml-scheduler-OrderedScheduler-2-0:PersistentTopic] -
[persistent://public/default/tp-359a79af-7dbe-4aa6-9e2a-6a1af63c4d8e] Topic
closed
2024-10-26T01:32:04,475 - INFO - [pulsar-timer-75-1:ConnectionHandler] -
[persistent://public/default/tp-359a79af-7dbe-4aa6-9e2a-6a1af63c4d8e]
[test-0-0] Reconnecting after 0.1 s timeout, hostUrl: null
2024-10-26T01:32:04,475 - INFO - [pulsar-timer-75-1:ConnectionHandler] -
[persistent://public/default/tp-359a79af-7dbe-4aa6-9e2a-6a1af63c4d8e] [s1]
Reconnecting after 0.1 s timeout, hostUrl: null
2024-10-26T01:32:04,476 - INFO - [metadata-store-2-1:ResourceLockImpl] -
Acquired resource lock on /namespace/public/default/0x00000000_0xffffffff
2024-10-26T01:32:04,476 - INFO - [metadata-store-2-1:ResourceLockImpl] -
Successfully re-acquired missing lock at
/namespace/public/default/0x00000000_0xffffffff
2024-10-26T01:32:04,476 - INFO - [metadata-store-2-1:ResourceLockImpl] -
Successfully revalidated the lock on
/namespace/public/default/0x00000000_0xffffffff
2024-10-26T01:32:04,488 - INFO - [pulsar-client-io-37-3:ConsumerImpl] -
[persistent://public/default/tp-359a79af-7dbe-4aa6-9e2a-6a1af63c4d8e][s1]
Subscribing to topic on cnx [id: 0x14fd4828, L:/127.0.0.1:49376 -
R:localhost/127.0.0.1:49366], consumerId 0
2024-10-26T01:32:04,488 - INFO - [pulsar-client-io-37-3:ProducerImpl] -
[persistent://public/default/tp-359a79af-7dbe-4aa6-9e2a-6a1af63c4d8e]
[test-0-0] Creating producer on cnx [id: 0x14fd4828, L:/127.0.0.1:49376 -
R:localhost/127.0.0.1:49366]
2024-10-26T01:32:04,490 - INFO - [pulsar-io-8-4:ServerCnx] - [[id:
0x4b4d1182, L:/127.0.0.1:49366 - R:/127.0.0.1:49376] [SR:127.0.0.1,
state:Connected]] Subscribing on topic
persistent://public/default/tp-359a79af-7dbe-4aa6-9e2a-6a1af63c4d8e / s1.
consumerId: 0, role: null
2024-10-26T01:32:04,493 - WARN - [pulsar-io-8-4:BrokerService] - Namespace
bundle for topic
(persistent://public/default/tp-359a79af-7dbe-4aa6-9e2a-6a1af63c4d8e) not
served by this instance:localhost:49367. Please redo the lookup. Request is
denied: namespace=public/default
2024-10-26T01:32:04,494 - WARN - [pulsar-io-8-4:ServerCnx] -
[/127.0.0.1:49376][persistent://public/default/tp-359a79af-7dbe-4aa6-9e2a-6a1af63c4d8e][s1]
Failed to create consumer: consumerId=0, Namespace bundle for topic
(persistent://public/default/tp-359a79af-7dbe-4aa6-9e2a-6a1af63c4d8e) not
served by this instance:localhost:49367. Please redo the lookup. Request is
denied: namespace=public/default
2024-10-26T01:32:04,499 - WARN - [pulsar-io-8-4:BrokerService] - Namespace
bundle for topic
(persistent://public/default/tp-359a79af-7dbe-4aa6-9e2a-6a1af63c4d8e) not
served by this instance:localhost:49367. Please redo the lookup. Request is
denied: namespace=public/default
2024-10-26T01:32:04,514 - WARN - [pulsar-client-io-37-3:ClientCnx] - [id:
0x14fd4828, L:/127.0.0.1:49376 - R:localhost/127.0.0.1:49366] Received error
from server: Namespace bundle for topic
(persistent://public/default/tp-359a79af-7dbe-4aa6-9e2a-6a1af63c4d8e) not
served by this instance:localhost:49367. Please redo the lookup. Request is
denied: namespace=public/default
2024-10-26T01:32:04,514 - WARN - [pulsar-client-io-37-3:ConsumerImpl] -
[persistent://public/default/tp-359a79af-7dbe-4aa6-9e2a-6a1af63c4d8e][s1]
Failed to subscribe to topic on localhost/127.0.0.1:49366
2024-10-26T01:32:04,514 - WARN - [pulsar-client-io-37-3:ConnectionHandler]
- [persistent://public/default/tp-359a79af-7dbe-4aa6-9e2a-6a1af63c4d8e] [s1]
Error connecting to broker:
org.apache.pulsar.client.api.PulsarClientException$LookupException:
{"errorMsg":"Namespace bundle for topic
(persistent://public/default/tp-359a79af-7dbe-4aa6-9e2a-6a1af63c4d8e) not
served by this instance:localhost:49367. Please redo the lookup. Request is
denied: namespace=public/default","reqId":850604948710385036,
"remote":"localhost/127.0.0.1:49366", "local":"/127.0.0.1:49376"}
2024-10-26T01:32:04,515 - WARN - [pulsar-client-io-37-3:ConnectionHandler]
- [persistent://public/default/tp-359a79af-7dbe-4aa6-9e2a-6a1af63c4d8e] [s1]
Could not get connection to broker:
org.apache.pulsar.client.api.PulsarClientException$LookupException:
{"errorMsg":"Namespace bundle for topic
(persistent://public/default/tp-359a79af-7dbe-4aa6-9e2a-6a1af63c4d8e) not
served by this instance:localhost:49367. Please redo the lookup. Request is
denied: namespace=public/default","reqId":850604948710385036,
"remote":"localhost/127.0.0.1:49366", "local":"/127.0.0.1:49376"} -- Will try
again in 0.199 s
```
During a lookup, if the bundle ownership is not found in the
`ownedBundlesCache` but exists in the metadata store, the broker reads the
ownership directly from the metadata store through the `LockManager`; see
`org.apache.pulsar.broker.namespace.OwnershipCache#getOwnerAsync`.
When the broker creates the producer or consumer, it checks whether the topic's
ownership exists in the cache. If it is not found, the broker logs
`Please redo the lookup. Request is denied: namespace=public/default`,
and the client loops, repeatedly reconnecting to the same broker.
I'm not sure whether this issue occurs only in the Pulsar test environment,
because the test case directly deletes the ZooKeeper data and the cached data.
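The mismatch described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `OwnershipMismatchSketch`, its two maps, and its method names are hypothetical stand-ins for the metadata store and the `ownedBundlesCache`, not Pulsar's actual classes or APIs.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical toy model of the two ownership checks; NOT the Pulsar implementation.
final class OwnershipMismatchSketch {
    // Stand-in for the metadata store: bundle -> owning broker.
    final Map<String, String> metadataStore = new ConcurrentHashMap<>();
    // Stand-in for the broker-local ownedBundlesCache.
    final Map<String, String> ownedBundlesCache = new ConcurrentHashMap<>();
    final String selfBroker;

    OwnershipMismatchSketch(String selfBroker) {
        this.selfBroker = selfBroker;
    }

    // Lookup path: check the local cache first, then fall back to the metadata store.
    Optional<String> lookupOwner(String bundle) {
        String cached = ownedBundlesCache.get(bundle);
        if (cached != null) {
            return Optional.of(cached);
        }
        return Optional.ofNullable(metadataStore.get(bundle));
    }

    // Producer/consumer creation path: only the local cache is consulted.
    boolean isServedLocally(String bundle) {
        return selfBroker.equals(ownedBundlesCache.get(bundle));
    }

    // True when the lookup sends the client to this broker, yet this broker
    // would answer "Please redo the lookup": the reconnect loop from the log.
    boolean clientWouldLoop(String bundle) {
        boolean lookupPointsHere = lookupOwner(bundle).map(selfBroker::equals).orElse(false);
        return lookupPointsHere && !isServedLocally(bundle);
    }
}
```

In this model the loop appears exactly when the metadata store names the current broker as owner while the local cache has no entry for the bundle.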
### Modifications
- In `getOwnerAsync`, if the ownership read from the metadata store belongs to
the current broker, the broker now tries to acquire the ownership through the
`ownedBundlesCache`, so that the ownership is not lost from the cache.
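
The idea behind this modification can be sketched roughly as follows. This is a simplified illustration, not the actual patch: the class `OwnershipRepairSketch`, the plain maps, and the helper `acquireThroughCache` are hypothetical, and the real `OwnershipCache` works with namespace bundles and resource locks rather than strings.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the repair step; NOT the actual Pulsar code.
final class OwnershipRepairSketch {
    // Stand-in for the metadata store: bundle -> owning broker.
    final Map<String, String> metadataStore = new ConcurrentHashMap<>();
    // Stand-in for the broker-local ownedBundlesCache.
    final Map<String, String> ownedBundlesCache = new ConcurrentHashMap<>();
    final String selfBroker;

    OwnershipRepairSketch(String selfBroker) {
        this.selfBroker = selfBroker;
    }

    // Hypothetical helper: acquire ownership via the cache so the entry exists locally.
    CompletableFuture<String> acquireThroughCache(String bundle) {
        ownedBundlesCache.put(bundle, selfBroker);
        return CompletableFuture.completedFuture(selfBroker);
    }

    CompletableFuture<Optional<String>> getOwnerAsync(String bundle) {
        String cached = ownedBundlesCache.get(bundle);
        if (cached != null) {
            return CompletableFuture.completedFuture(Optional.of(cached));
        }
        String stored = metadataStore.get(bundle);
        if (stored == null) {
            return CompletableFuture.completedFuture(Optional.empty());
        }
        if (selfBroker.equals(stored)) {
            // The metadata store says this broker is the owner but the cache
            // lost the entry: restore it instead of only reporting the owner.
            return acquireThroughCache(bundle).thenApply(Optional::of);
        }
        // Owned by another broker: report it without touching the local cache.
        return CompletableFuture.completedFuture(Optional.of(stored));
    }
}
```

With this shape, a lookup that resolves to the current broker also repopulates the cache, so a later producer or consumer creation on the same broker finds the ownership locally.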
### Documentation
<!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->
- [ ] `doc` <!-- Your PR contains doc changes. -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update
later -->
- [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->