roman-popenov opened a new pull request #6158: [Issue-5994][helm]: Start proxy 
pods when at least one broker pod is running
URL: https://github.com/apache/pulsar/pull/6158
 
 
   ### Motivation
   Fixes #5994:
   If the proxy service comes up before the brokers are up and reachable, 
`bin/pulsar-admin` commands run from inside the proxy pod fail with HTTP 403.
    
   The proxy is also unable to connect to the brokers when data is pushed 
through the binary port, failing with the following error:
   ```bash
   Caused by: 
org.apache.pulsar.broker.service.BrokerServiceException$PersistenceException: 
org.apache.bookkeeper.mledger.ManagedLedgerException: Not enough non-faulty 
bookies available
        ... 14 more
   Caused by: org.apache.bookkeeper.mledger.ManagedLedgerException: Not enough 
non-faulty bookies available
   22:11:07.633 [pulsar-web-32-6] INFO  org.eclipse.jetty.server.RequestLog - 
172.17.0.6 - - [24/Jan/2020:22:11:07 +0000] "PUT 
/admin/v2/persistent/public/functions/assignments HTTP/1.1" 500 2528 "-" 
"Pulsar-Java-v2.5.0" 280
   ```
   
   #### Workaround:
   Restart the proxy pods once the broker pods are running.
   
   #### Proposed solution:
   Delay starting the proxies until at least one broker is reachable in 
the cluster.
   
   ### Modifications
   
   The changes are in the `proxy-deployment.yaml` helm template file, which now 
defines a new init container that runs before the proxy is started. The init 
container waits until the broker is reachable by running `nslookup` against the 
broker service, sleeping 30 seconds between retries and retrying up to as many 
times as there are brokers.
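
   The wait logic described above can be sketched roughly as follows. This is 
a minimal illustration, not the actual contents of the template in this PR: the 
function name `wait_for_broker`, its arguments, and the `LOOKUP_CMD` override 
(which exists only so the loop can be exercised without a real DNS lookup) are 
all assumptions.

```shell
#!/bin/sh
# Sketch of an init-container wait loop (hypothetical names throughout).
# Usage: wait_for_broker <service-name> <max-retries> <delay-seconds>
wait_for_broker() {
  service="$1"      # DNS name of the broker service (assumed)
  max_retries="$2"  # e.g. the number of broker replicas
  delay="$3"        # seconds to sleep between attempts
  attempt=1
  while [ "$attempt" -le "$max_retries" ]; do
    # LOOKUP_CMD defaults to nslookup; overriding it is a test hook only.
    if ${LOOKUP_CMD:-nslookup} "$service" >/dev/null 2>&1; then
      echo "resolved $service on attempt $attempt"
      return 0
    fi
    echo "attempt $attempt of $max_retries: $service not resolvable yet" >&2
    # Sleep only when another attempt will follow.
    if [ "$attempt" -lt "$max_retries" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}
```

   An init container command could then call something like 
`wait_for_broker "$BROKER_SERVICE" "$BROKER_REPLICAS" 30`, where both variable 
names are placeholders. A successful `nslookup` only shows that the service name 
resolves, so it is used here as an approximation of broker reachability.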
   
   An alternative that doesn't always work was `until nslookup 
broker-service; sleep 2; done;`, but the 403 would still sometimes occur (it 
could have been a fluke, but I saw it happen once).
   
   ### Verifying this change
   1) Follow the instructions for deploying with helm and run:
   `helm install pulsar --values pulsar/values-mini.yaml ./pulsar/`. 
   2) Wait until all the services are up and running.  
   3) Connect to the proxy pod and run `bin/pulsar-admin broker-stats 
monitoring-metrics` - no 403 or permission errors should arise
   4) Set up a tenant and a namespace
   5) Push data into a topic - there should be no errors in the proxy logs and 
the client should be able to push data into the cluster through the proxies
   
   This change should already be covered by existing tests?
   
   #### Modules affected:
   The changes in this PR affect deployments that use the helm charts. 
The Pulsar proxy pods are now started only once at least one broker pod is 
running and reachable.
   
   ### Documentation
   Currently neither the order of pod startup nor the reasons for it are 
documented anywhere in detail. It would be good to document this.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
