ivankelly commented on a change in pull request #11627:
URL: https://github.com/apache/pulsar/pull/11627#discussion_r688379620



##########
File path: 
pulsar-client/src/main/java/org/apache/pulsar/client/impl/ConsumerImpl.java
##########
@@ -243,7 +245,7 @@ protected ConsumerImpl(PulsarClientImpl client, String 
topic, ConsumerConfigurat
         this.initialStartMessageId = this.startMessageId;
         this.startMessageRollbackDurationInSec = 
startMessageRollbackDurationInSec;
         AVAILABLE_PERMITS_UPDATER.set(this, 0);
-        this.subscribeTimeout = System.currentTimeMillis() + 
client.getConfiguration().getOperationTimeoutMs();
+        this.subscribeTimeout = System.currentTimeMillis() + 
client.getConfiguration().getLookupTimeoutMs();

Review comment:
       It also includes the time to do CommandSubscribe and CommandPublish. 
   
   Separating the timeout to do lookup from the time to establish the correct 
connection is a major rework of how timeouts work. The lookup timeout and retry 
is handled in ConsumerImpl and ProducerImpl, and these only get signals via 
connectionFailed and connectionOpen callbacks. So separate it out, we'd need to 
refactor how the Impls get a connection. Currently it goes from 
Impl->ConnectionHandler->PulsarClientImpl->LookupService. I don't think it's 
worth it. It's already clear if subscribe failed due to lookup timeout or a 
connection timeout. The exception returned is different, 
PulsarClientException.TimeoutException for the former, netty 
ConnectTimeoutException for the latter. If you want to know which node you 
failed to connect to, it's there in the exception message.
   ```
   java.util.concurrent.CompletionException: 
org.apache.pulsar.client.api.PulsarClientException: 
java.util.concurrent.CompletionException: 
io.netty.channel.ConnectTimeoutException: connection timed out: 
/192.168.1.34:5432
   ```
   
   w.r.t. including the CommandSubscribe and CommandProducer, I can change 
this, but it would create a behavioral change by default as then the 
operationTimeout for these commands only starts counting down after lookup has 
succeeded. i.e. the whole operation could take twice as long. I guess this 
isn't a major issue though.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to