I can't understand how the deadlock happens. Can you explain it in detail?

On 2022/02/17 03:00:36 PengHui Li wrote:
> Hi Lari,
> 
> Thanks for raising a proposal and starting the discussion thread here.
> Increasing the number of threads does not solve the underlying problem,
> and it also brings additional resource consumption.
> 
> Thanks, Matteo provided a great explanation,
> and I totally agree with it.
> What we should do is optimize thread usage and
> reduce blocking calls, rather than going in the opposite direction.
> 
> For the current REST API implementation,
> I see two main problems:
> 
> 1. The API is defined to use Jetty's async mode, but the implementation
> is actually a blocking call. I think this is clearly bad code: it
> confuses contributors and brings potential risks
> 2. We don't know the exact return type from the interface definition [1]
> until we read the implementation code here [2]
> 
> So, I think changing the current implementation to be purely async,
> and avoiding passing the `AsyncResponse` into the implementation,
> is the right way to improve the REST API and fix the potential problems
> (a rough sketch of the difference follows the reference links below).
> 
> [1]
> https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/admin/v2/PersistentTopics.java#L1073
> [2]
> https://github.com/apache/pulsar/blob/df9a12dc90316e0d5c1d9effb416ec83fbcf9b5e/pulsar-broker/src/main/java/org/apache/pulsar/broker/admin/impl/PersistentTopicsBase.java#L1259
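> 
> As a rough sketch of the difference (the resource and method names below
> are made up for illustration; they are not the actual Pulsar code):
> 
> import java.util.concurrent.CompletableFuture;
> import javax.ws.rs.GET;
> import javax.ws.rs.Path;
> import javax.ws.rs.container.AsyncResponse;
> import javax.ws.rs.container.Suspended;
> 
> @Path("/example-topics")
> public class ExampleTopicsResource {
> 
>     // Anti-pattern: the signature looks async (AsyncResponse), but the
>     // implementation blocks the Jetty thread anyway.
>     @GET
>     @Path("/stats-blocking")
>     public void getStatsBlocking(@Suspended AsyncResponse response) {
>         Object stats = fetchStatsAsync().join(); // occupies a Jetty thread
>         response.resume(stats);
>     }
> 
>     // Purely async: the handler returns immediately, and the response is
>     // resumed from a callback once the future completes. The implementation
>     // exposes a CompletableFuture instead of receiving the AsyncResponse.
>     @GET
>     @Path("/stats")
>     public void getStats(@Suspended AsyncResponse response) {
>         fetchStatsAsync()
>                 .thenAccept(response::resume)
>                 .exceptionally(ex -> {
>                     response.resume(ex);
>                     return null;
>                 });
>     }
> 
>     private CompletableFuture<Object> fetchStatsAsync() {
>         return CompletableFuture.completedFuture(new Object());
>     }
> }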
> 
> Thanks,
> Penghui
> 
> On Thu, Feb 17, 2022 at 10:00 AM Matteo Merli <[email protected]>
> wrote:
> 
> > Hi Lari,
> >
> > thanks for the proposal, though I don't think it would be good to
> > change this default value.
> >
> > Pulsar HTTP API is an administrative service and it is not on the
> > critical data path.
> > If all the threads in the pool are busy, it is good to apply back
> > pressure and slow down the incoming new requests.
> >
> > In the past, we had several cases of deadlocks involving HTTP Jetty
> > threads. In all these cases, the root cause has been a broker that in
> > response to an HTTP request is spawning HTTP requests to other
> > brokers, and some of these requests are going to itself.
> >
> > These kinds of deadlock are unrelated to the number of HTTP threads in
> > the pool and must be dealt with by making the underlying operation
> > asynchronous.
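> >
> > As a minimal sketch of that failure mode (illustrative code, not the
> > actual broker handlers): with a pool of N Jetty threads, N concurrent
> > requests to /a leave no thread free to serve the nested call to /b, and
> > every request waits forever.
> >
> > import java.net.URI;
> > import java.net.http.HttpClient;
> > import java.net.http.HttpRequest;
> > import java.net.http.HttpResponse;
> > import javax.ws.rs.GET;
> > import javax.ws.rs.Path;
> >
> > @Path("/example")
> > public class SelfCallResource {
> >
> >     private final HttpClient httpClient = HttpClient.newHttpClient();
> >
> >     // Runs on a Jetty pool thread...
> >     @GET
> >     @Path("/a")
> >     public String a() throws Exception {
> >         // ...and synchronously calls /b on the same server. If every pool
> >         // thread is already busy inside a(), no thread is left to serve
> >         // /b, so all of them block here forever. A bigger pool only raises
> >         // the number of concurrent requests needed to hit the deadlock.
> >         HttpRequest request = HttpRequest
> >                 .newBuilder(URI.create("http://localhost:8080/example/b"))
> >                 .build();
> >         HttpResponse<String> response =
> >                 httpClient.send(request, HttpResponse.BodyHandlers.ofString());
> >         return response.body();
> >     }
> >
> >     @GET
> >     @Path("/b")
> >     public String b() {
> >         return "ok";
> >     }
> > }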
> >
> > Finally, each user is free to change the default to something that
> > suits them better, though I really don't think it makes sense for us to
> > default to 200 threads, as:
> >   1. It will not improve the Pulsar performance
> >   2. It will not magically solve any deadlock issue
> >   3. It will just consume much more resources for everyone
> >
> >
> > Replying in-line to some of the points.
> >
> > >
> > > Additional context
> > >
> > > -   Examples of previous issues where Jetty threads have been occupied
> > >     and caused problems: #13666 #4756 #10619
> >
> > All these examples are cases of what I have described above: an HTTP
> > handler triggering a call to the same HTTP server resulting in a
> > deadlock.
> >
> > Having more threads wouldn't have fixed them.
> >
> > > -   Mailing list thread about “make async” changes:
> > >    https://lists.apache.org/thread/tn7rt59cd1k724l4ytfcmzx1w2sbtw7l
> >
> > The rationale for these changes has not much to do with Jetty, but
> > rather with making sure that we are not mixing blocking and non-blocking
> > operations when handling these requests, which is prone to causing
> > deadlocks in the broker's internal threads (irrespective of any Jetty
> > thread pool size).
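> >
> > A tiny sketch of that pattern (again illustrative, not actual broker
> > code): blocking on a future from the very executor thread that is
> > supposed to complete it hangs forever, no matter how large the Jetty
> > pool is.
> >
> > import java.util.concurrent.CompletableFuture;
> > import java.util.concurrent.ExecutorService;
> > import java.util.concurrent.Executors;
> >
> > public class MixedBlockingExample {
> >
> >     // Stand-in for a broker-internal, single-threaded executor.
> >     private static final ExecutorService internal =
> >             Executors.newSingleThreadExecutor();
> >
> >     static CompletableFuture<String> lookupAsync() {
> >         // The result is produced on the internal thread.
> >         return CompletableFuture.supplyAsync(() -> "result", internal);
> >     }
> >
> >     public static void main(String[] args) {
> >         internal.submit(() -> {
> >             // Blocking on a future whose completion needs this same
> >             // thread: the supplier can never run, so join() never returns.
> >             String value = lookupAsync().join();
> >             System.out.println(value);
> >         });
> >     }
> > }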
> >
> > > Q: What’s the reason for setting the default value to 200? If the node
> > > just has one core, what will happen?
> > >
> > > These are threads. Jetty defaults to a maximum of 200 threads, to prevent
> > > thread pool starvation. This is what is recommended when using the blocking
> > > Servlet API. The problem is that Pulsar uses the blocking Servlet API but
> > > doesn’t have the sufficient number of threads that is needed and
> > > recommended.
> >
> > We need to keep in mind that typical usage of Jetty is as a Servlet
> > container, dedicated to that. You have servlets doing disk IO reads
> > and blocking calls on JDBC drivers.
> >
> > Pulsar usage is *very* different from that. Admin API is a tiny
> > portion of what a broker does. Dedicating 200 threads to that would be
> > a massive waste of resources (CPU & memory).
> >
> > > The value 200 doesn’t mean that there will be 200 threads to start with.
> > > This is the maximum size for the thread pool. When the value is more
> > > than 8, Jetty will start with 8 initial threads and add more threads to
> > > the pool when all threads are occupied.
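> > >
> > > For reference, that behaviour can be reproduced in a minimal Jetty
> > > setup roughly like this (illustrative values and setup only):
> > >
> > > import org.eclipse.jetty.server.Server;
> > > import org.eclipse.jetty.server.ServerConnector;
> > > import org.eclipse.jetty.util.thread.QueuedThreadPool;
> > >
> > > public class JettyPoolExample {
> > >     public static void main(String[] args) throws Exception {
> > >         // Capped at 200 threads, starting from 8; Jetty adds threads
> > >         // on demand as the existing ones become busy.
> > >         QueuedThreadPool threadPool = new QueuedThreadPool(200, 8);
> > >         Server server = new Server(threadPool);
> > >         ServerConnector connector = new ServerConnector(server);
> > >         connector.setPort(8080);
> > >         server.addConnector(connector);
> > >         server.start();
> > >     }
> > > }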
> > >
> > > Q: Do we need to take the number of system cores into consideration for
> > > the maximum threads of the thread pool?
> > >
> > > No. Jetty is different from Netty in this aspect. In Netty, everything
> > > should be asynchronous and “thou shall never block”. In Jetty, the
> > > maximum number of threads for the thread pool should be set to 50-500
> > > threads and blocking operations are fine.
> >
> > I disagree. In a container configured with 0.5 CPU cores, it would be
> > very bad to run 200 threads, and it would bring no advantage
> > whatsoever.
> >
> > That is the reason why the default thread pool size is pegged to the
> > number of cores.
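> >
> > Something along these lines, i.e. deriving the cap from the cores the
> > process actually has rather than a fixed 200 (the exact formula here is
> > only an illustration, not necessarily what Pulsar ships):
> >
> > public class HttpPoolSizing {
> >     public static void main(String[] args) {
> >         // Derive the maximum HTTP thread count from the available cores,
> >         // with a small floor, instead of hard-coding 200.
> >         int cores = Runtime.getRuntime().availableProcessors();
> >         int maxHttpThreads = Math.max(8, 2 * cores);
> >         System.out.println("maxHttpThreads = " + maxHttpThreads);
> >     }
> > }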
> >
> > > The recommendation for the thread pool is explained in Jetty
> > > documentation
> > >
> > https://www.eclipse.org/jetty/documentation/jetty-9/index.html#_thread_pool
> > > > Thread Pool > Configure with goal of limiting memory usage maximum
> > > > available. Typically this is >50 and <500
> >
> > The recommendations for Jetty are relative to typical Jetty workloads,
> > something the Pulsar broker is definitely not.
> >
> 
