Hello,

I too have my java server completely crashing when using SSL HTTPS when
someone sends a malformed http request...  It's a Java problem, not a netty
one...  I forget the exact problem, but you can harden your machine by
using the lowest TLS like 1.3 or something... I've been able to reduce the
number of malformed requests that shut down my java program, but I'm not
sure I'm 100% done with em... You gotta play with your Linux root and all
sorts of things... Malformed HTTPS requests should be handled by exception
properly, but often are not and crash the system.  It's ugly and a known
issue... If this is the issue you have.

,Jim

On Wed, Sep 27, 2023 at 12:04 PM Chris Bailey <[email protected]> wrote:

> Hi,
>
>               We’re leveraging netty (v4.1.92) via Vertx (v4.3.8)
> configured to use OpenJdkSSLEngine with a deletegatedTaskExecutor() when
> creating the ssl_handler for our webserver.  In general, we see SSL
> operations (e.g., certificate validation) occurring on a worker thread, but
> we have run into a deadlock due to our current implementation and wanted
> some feedback on options.
>
>
>
> The crux of the issue is that during our certificate validation via our
> custom TrustManager, we need to communicate with another internal component
> (which is running on a typical event loop thread) to complete the
> validation processing, but other SSL operations block the event loop on
> which that internal service is running by grabbing a lock.  Its my
> understanding that not all blocking SSL operations are offloaded, so there
> are likely more cases like these floating around.
>
>
>
> What we are specifically seeing is that while we’re waiting on a response
> from the internal component to validate the certificate, a simultaneous
> request to close a TLS connection comes in.  The closing of the session is
> running on the event loop and blocks by waiting on a lock, but
> unfortunately, that lock is held by the worker thread waiting on a response
> from the internal component.  IOW, the internal service cannot respond to
> the worker thread because it its event loop is blocked by the close
> connection processing.  At this point, our entire webserver is dead in the
> water until we restart the JVM.
>
>
>
> Sample traces are listed below from this example:
>
>    - Close connection processing is running on event loop thread 0 and is
>    blocked trying to grab a lock that is held by a worker thread
>    - Cert validation in the TrustManager is running on a worker thread
>    and is awaiting a response from the blocked service running on event loop
>    thread 0
>    - Internal service to complete the cert validation is affined to event
>    loop thread 0, but its blocked because of the close connection processing
>
>
>
> We’ve looked at several options:
>
>    - Detect the deadlock and fail the client authentication.  The issue
>    here is bad user experience so we’d like to avoid it.
>    - Segregate the our TLS handling in the webserver and the blocking SSL
>    handling into two DIFFERENT worker threads so that that we never block an
>    event loop.  This is a risker change for us, though.
>    - Remove the need to communicate with the internal service running on
>    an event loop thread.
>
>
>
> Any thoughts on other/better ways to handle this?
>
>
>
> Thanks in advance,
>
> Chris
>
>
>
> # close connection handler running on event loop thread 0 is blocked
>
> Sep 15 10:13:43.470794 APM00203802612-A control-path[37578]:
> "vert.x-eventloop-thread-0" #25 prio=5 os_prio=0 cpu=595313.19ms
> elapsed=28293.78s tid=0x00007f5bb8021800 nid=0x22c waiting for monitor
> entry  [0x00007f5c0034d000]
>
> Sep 15 10:13:43.471494 APM00203802612-A control-path[37578]:
>  java.lang.Thread.State: BLOCKED (on object monitor)
>
> Sep 15 10:13:43.471494 APM00203802612-A control-path[37578]:         at
> sun.security.ssl.SSLEngineImpl.closeOutbound(
> [email protected]/SSLEngineImpl.java:749)
>
> Sep 15 10:13:43.471494 APM00203802612-A control-path[37578]:         -
> waiting to lock <0x00000000dca1d0f8> (a sun.security.ssl.SSLEngineImpl)
>
> Sep 15 10:13:43.471494 APM00203802612-A control-path[37578]:         at
> io.netty.handler.ssl.SslHandler.setHandshakeFailure(SslHandler.java:1874)
>
> Sep 15 10:13:43.471494 APM00203802612-A control-path[37578]:         at
> io.netty.handler.ssl.SslHandler.channelInactive(SslHandler.java:1078)
>
> Sep 15 10:13:43.471494 APM00203802612-A control-path[37578]:         at
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:305)
>
> Sep 15 10:13:43.471494 APM00203802612-A control-path[37578]:         at
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:281)
>
> Sep 15 10:13:43.471494 APM00203802612-A control-path[37578]:         at
> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:274)
>
> Sep 15 10:13:43.471494 APM00203802612-A control-path[37578]:         at
> io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1405)
>
> Sep 15 10:13:43.471494 APM00203802612-A control-path[37578]:         at
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:301)
>
> Sep 15 10:13:43.471494 APM00203802612-A control-path[37578]:         at
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:281)
>
> Sep 15 10:13:43.471494 APM00203802612-A control-path[37578]:         at
> io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:901)
>
> Sep 15 10:13:43.471494 APM00203802612-A control-path[37578]:         at
> io.netty.channel.AbstractChannel$AbstractUnsafe$7.run(AbstractChannel.java:813)
>
> Sep 15 10:13:43.471494 APM00203802612-A control-path[37578]:         at
> io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
>
> Sep 15 10:13:43.471494 APM00203802612-A control-path[37578]:         at
> io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167)
>
> Sep 15 10:13:43.471494 APM00203802612-A control-path[37578]:         at
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
>
> Sep 15 10:13:43.471494 APM00203802612-A control-path[37578]:         at
> io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:566)
>
> Sep 15 10:13:43.471494 APM00203802612-A control-path[37578]:         at
> io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
>
> Sep 15 10:13:43.471494 APM00203802612-A control-path[37578]:         at
> io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>
> Sep 15 10:13:43.471494 APM00203802612-A control-path[37578]:         at
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>
> Sep 15 10:13:43.471494 APM00203802612-A control-path[37578]:         at
> java.lang.Thread.run([email protected]/Thread.java:829)
>
>
>
>
>
> # Trust manager waiting on a response from our internal component which is
> running on event loop thread 0.  Note that the lock that the close
> connection handler is waiting on is held by this thread.
> Sep 15 10:13:43.479257 APM00203802612-A control-path[37578]:
> "vert.x-internal-blocking-1" #42 prio=5 os_prio=0 cpu=23498.06ms
> elapsed=28290.74s tid=0x00007f5bb02b1000 nid=0x23b waiting on condition
>  [0x00007f5bc4e59000]Sep 15 10:13:43.479257 APM00203802612-A
> control-path[37578]:    java.lang.Thread.State: WAITING (parking)Sep 15
> 10:13:43.479257 APM00203802612-A control-path[37578]:         at
> jdk.internal.misc.Unsafe.park([email protected]/Native Method)Sep 15
> 10:13:43.479257 APM00203802612-A control-path[37578]:         - parking to
> wait for  <0x00000000dca24ec0> (a java.util.concurrent.CountDownLatch$Sync)Sep
> 15 10:13:43.479257 APM00203802612-A control-path[37578]:         at
> java.util.concurrent.locks.LockSupport.park(
> [email protected]/LockSupport.java:194)Sep 15 10:13:43.479257
> APM00203802612-A control-path[37578]:         at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(
> [email protected]/AbstractQueuedSynchronizer.java:885)Sep 15
> 10:13:43.479257 APM00203802612-A control-path[37578]:         at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(
> [email protected]/AbstractQueuedSynchronizer.java:1039)Sep 15
> 10:13:43.479257 APM00203802612-A control-path[37578]:         at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(
> [email protected]/AbstractQueuedSynchronizer.java:1345)Sep 15
> 10:13:43.479257 APM00203802612-A control-path[37578]:         at
> java.util.concurrent.CountDownLatch.await(
> [email protected]/CountDownLatch.java:232)Sep 15 10:13:43.479257
> APM00203802612-A control-path[37578]:         at
> io.reactivex.rxjava3.internal.observers.BlockingMultiObserver.blockingGet(BlockingMultiObserver.java:86)Sep
> 15 10:13:43.479257 APM00203802612-A control-path[37578]:         at
> io.reactivex.rxjava3.core.Single.blockingGet(Single.java:3645)Sep 15
> 10:13:43.479257 APM00203802612-A control-path[37578]:         at
> com.emc.cyclone.contexts.security.credstore.CycTrustOptions$CycTrustManager.validateCertificateWithCmfa(CycTrustOptions.java:442)
>
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "Netty discussions" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/netty/8f31febf-5aaa-47b5-8a37-b71d63edf6acn%40googlegroups.com
> <https://groups.google.com/d/msgid/netty/8f31febf-5aaa-47b5-8a37-b71d63edf6acn%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>

-- 
You received this message because you are subscribed to the Google Groups 
"Netty discussions" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/netty/CADMLN6bFm%2BCNEP9SdeCrZspFTQCVJnRgkvUY21p%2BP0xBp58TAw%40mail.gmail.com.

Reply via email to