lhotari opened a new issue #14329:
URL: https://github.com/apache/pulsar/issues/14329


   
   ## Motivation
   
   The Pulsar Admin API uses the blocking servlet API, so each in-flight request occupies a Jetty thread until it completes. When all Jetty threads are occupied, the Admin API becomes unavailable. The default maximum number of Jetty threads in Pulsar is too low, and that is the root cause of many problems where the Pulsar Admin API is unavailable because all threads are in use.
   
   ## Additional context
   
   - Examples of previous issues where Jetty threads have been occupied and 
caused problems: #13666 #4756 #10619
   - Mailing list thread about "make async" changes: 
https://lists.apache.org/thread/tn7rt59cd1k724l4ytfcmzx1w2sbtw7l
   
   ## Implementation
   
   - Jetty defaults to 200 maximum threads, to prevent thread pool starvation. 
Make Pulsar use the same default value by setting `numHttpServerThreads=200`.
   - Update the documentation for `numHttpServerThreads`
     - The PR is already in place: https://github.com/apache/pulsar/pull/14320
   - Set Jetty selectors and acceptors parameters to `-1` so that Jetty 
automatically chooses optimal values based on available cores. The rationale is 
explained in the Q&A below.
     - A separate PR will be made for this change.
   
   ## Q&A
   
   ### Q: What's the reason for setting the default value to 200? What happens if the node has only one core?
   
   These are threads, not cores. Jetty defaults to 200 maximum threads to prevent thread pool starvation, which is the recommended setup when using the blocking Servlet API. The problem is that Pulsar uses the blocking servlet API but isn't configured with the number of threads that this model needs.
   
   The value 200 doesn't mean that there will be 200 threads to start with. 
This is the maximum size for the thread pool. When the value is more than 8, 
Jetty will start with 8 initial threads and add more threads to the pool when 
all threads are occupied.
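
The difference between the maximum and the initial pool size can be sketched with the JDK's own `ThreadPoolExecutor`. This is only an analogy under stated assumptions: it is not Jetty's thread pool implementation, and the class and method names (`PoolGrowthSketch`, `poolSizeUnderLoad`) are hypothetical. The sketch pre-starts the minimum number of threads and then lets the pool grow only while blocking tasks keep every thread busy.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; uses the JDK executor as a stand-in for Jetty's pool.
class PoolGrowthSketch {

    /**
     * Starts a pool with minThreads pre-started threads and a cap of maxThreads,
     * submits blockingTasks tasks that block until released, and returns the
     * pool size observed while all of those tasks are still blocked.
     */
    static int poolSizeUnderLoad(int minThreads, int maxThreads, int blockingTasks) {
        // A SynchronousQueue has no capacity, so a task either goes to an idle
        // thread directly or forces a new thread to be created (up to maxThreads).
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                minThreads, maxThreads, 60, TimeUnit.SECONDS, new SynchronousQueue<>());
        pool.prestartAllCoreThreads(); // mimic a pool that starts with its minimum size

        CountDownLatch started = new CountDownLatch(blockingTasks);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < blockingTasks; i++) {
                pool.execute(() -> {
                    started.countDown();
                    try {
                        release.await(); // simulate a blocking servlet request
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            started.await(); // every task now occupies one thread
            return pool.getPoolSize();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }
}
```

Calling `PoolGrowthSketch.poolSizeUnderLoad(8, 200, 0)` reports only the 8 pre-started threads, while 20 concurrent blocking tasks push the pool to at least 20 threads, still far below the configured maximum of 200.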
   
   ### Q: Do we need to take the number of system cores into consideration for the maximum threads of the thread pool?
   
   No. Jetty is different from Netty in this aspect. In Netty, everything 
should be asynchronous and "thou shall never block". In Jetty, the maximum 
number of threads for the thread pool should be set to 50-500 threads and 
blocking operations are fine. 
   
   The recommendation for the thread pool is explained in Jetty documentation 
https://www.eclipse.org/jetty/documentation/jetty-9/index.html#_thread_pool
   > Thread Pool
   > Configure with goal of limiting memory usage maximum available. Typically 
this is >50 and <500
   
   However, there are separate settings which should take the number of 
available processors (cores) into account in Jetty. 
   
   HTTP port acceptor and selector count:

   https://github.com/apache/pulsar/blob/b540523b474e4194e30c1acab65dfafdd11d3210/pulsar-broker/src/main/java/org/apache/pulsar/broker/web/WebService.java#L88

   HTTPS port acceptor and selector count:

   https://github.com/apache/pulsar/blob/b540523b474e4194e30c1acab65dfafdd11d3210/pulsar-broker/src/main/java/org/apache/pulsar/broker/web/WebService.java#L125
   
   Jetty [documentation for acceptors](https://www.eclipse.org/jetty/documentation/jetty-9/index.html#_acceptors):
   > Acceptors
   > The standard rule of thumb for the number of Accepters to configure is one 
per CPU on a given machine.
   
   Jetty [documentation for 
selectors](https://www.eclipse.org/jetty/javadoc/jetty-9/org/eclipse/jetty/server/ServerConnector.html):
   > Selectors
   > The default number of selectors is equal to half of the number of 
processors available to the JVM, which should allow optimal performance even if 
all the connections used are performing significant non-blocking work in the 
callback tasks.
   
   The relevant Jetty settings are the "acceptor" and "selector" thread counts. Both are currently fixed to 1 in Pulsar.
   Both `acceptors` and `selectors` should be set to -1, in which case Jetty picks the recommended count based on the available cores.
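
To make the quoted rules of thumb concrete, here is a minimal sketch of the arithmetic they describe: one acceptor per CPU and half as many selectors as processors. It only restates the documentation quoted above; it is not the formula Jetty applies internally when `-1` is passed, and the class and method names (`ConnectorThreadRuleOfThumb`, `acceptorsFor`, `selectorsFor`) are hypothetical.

```java
// Hypothetical helper that restates the quoted rules of thumb.
// It does not reproduce Jetty's internal defaults for acceptors or selectors.
class ConnectorThreadRuleOfThumb {

    /** "One per CPU on a given machine", per the quoted acceptor guidance. */
    static int acceptorsFor(int cores) {
        requirePositive(cores);
        return cores;
    }

    /** "Half of the number of processors", per the quoted selector guidance; at least 1. */
    static int selectorsFor(int cores) {
        requirePositive(cores);
        return Math.max(1, cores / 2);
    }

    private static void requirePositive(int cores) {
        if (cores < 1) {
            throw new IllegalArgumentException("cores must be >= 1, got " + cores);
        }
    }
}
```

On a given host, the input would typically come from `Runtime.getRuntime().availableProcessors()`. With a fixed value of 1 for both settings, as in Pulsar today, a machine with many cores gets no benefit from them for accepting connections or selecting ready connections.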

