Yu Yang created KAFKA-7364:
------------------------------

             Summary: kafka periodically run into high cpu usage with ssl 
writing
                 Key: KAFKA-7364
                 URL: https://issues.apache.org/jira/browse/KAFKA-7364
             Project: Kafka
          Issue Type: Bug
          Components: core
    Affects Versions: 2.0.0
            Reporter: Yu Yang


while testing ssl writing to kafka, we found that kafka often run into high cpu 
usage due to inefficiency in jdk ssl implementation. 

In detail, we use a test cluster that have 12 d2.8xlarge instances, jdk-10.0.2, 
 and hosts only one topic that have ~20k producers write to through ssl 
channel. We observed that  the network threads often get 100% cpu usage after 
enabling ssl writing to kafka.   To improve kafka's throughput, we have 
"num.network.threads=32" for the broker.  Even with 32 network threads, we see 
the broker cpu usage jump right after ssl writing is enabled. 

 !Screen Shot 2018-08-30 at 10.57.32 PM.png! 

When the broker's cpu usage is high, 'perf top' shows that kafka is busy with 
executing code in  libsunec.so.  The following is a sample stack track that we 
get when the broker's cpu usage was high. 

{code}
Thread 77562: (state = IN_NATIVE)
 - sun.security.ec.ECDSASignature.verifySignedDigest(byte[], byte[], byte[], 
byte[]) @bci=0 (Compiled frame; information may be imprecise)
 - sun.security.ec.ECDSASignature.engineVerify(byte[]) @bci=70, line=321 
(Compiled frame)
 - java.security.Signature$Delegate.engineVerify(byte[]) @bci=9, line=1222 
(Compiled frame)
 - java.security.Signature.verify(byte[]) @bci=10, line=655 (Compiled frame)
 - sun.security.x509.X509CertImpl.verify(java.security.PublicKey, 
java.lang.String) @bci=136, line=444 (Compiled frame)
 - 
sun.security.provider.certpath.BasicChecker.verifySignature(java.security.cert.X509Certificate)
 @bci=48, line=166 (Compiled frame)
 - 
sun.security.provider.certpath.BasicChecker.check(java.security.cert.Certificate,
 java.util.Collection) @bci=24, line=147 (Compiled frame)
 - 
sun.security.provider.certpath.PKIXMasterCertPathValidator.validate(java.security.cert.CertPath,
 java.util.List, java.util.List) @bci=316, line=125 (Compiled frame)
 - 
sun.security.provider.certpath.PKIXCertPathValidator.validate(java.security.cert.TrustAnchor,
 sun.security.provider.certpath.PKIX$ValidatorParams) @bci=390, line=233 
(Compiled frame)
 - 
sun.security.provider.certpath.PKIXCertPathValidator.validate(sun.security.provider.certpath.PKIX$ValidatorParams)
 @bci=217, line=141 (Compiled frame)
 - 
sun.security.provider.certpath.PKIXCertPathValidator.engineValidate(java.security.cert.CertPath,
 java.security.cert.CertPathParameters) @bci=7, line=80 (Compiled frame)
 - java.security.cert.CertPathValidator.validate(java.security.cert.CertPath, 
java.security.cert.CertPathParameters) @bci=6, line=292 (Compiled frame)
 - 
sun.security.validator.PKIXValidator.doValidate(java.security.cert.X509Certificate[],
 java.security.cert.PKIXBuilderParameters) @bci=34, line=357 (Compiled frame)
 - 
sun.security.validator.PKIXValidator.engineValidate(java.security.cert.X509Certificate[],
 java.util.Collection, java.security.AlgorithmConstraints, java.lang.Object) 
@bci=232, line=259 (Compiled frame)
 - 
sun.security.validator.Validator.validate(java.security.cert.X509Certificate[], 
java.util.Collection, java.security.AlgorithmConstraints, java.lang.Object) 
@bci=6, line=260 (Compiled frame)
 - 
sun.security.ssl.X509TrustManagerImpl.validate(sun.security.validator.Validator,
 java.security.cert.X509Certificate[], java.security.AlgorithmConstraints, 
java.lang.String) @bci=10, line=324 (Compiled frame)
 - 
sun.security.ssl.X509TrustManagerImpl.checkTrusted(java.security.cert.X509Certificate[],
 java.lang.String, javax.net.ssl.SSLEngine, boolean) @bci=179, line=279 
(Compiled frame)
 - 
sun.security.ssl.X509TrustManagerImpl.checkClientTrusted(java.security.cert.X509Certificate[],
 java.lang.String, javax.net.ssl.SSLEngine) @bci=5, line=130 (Compiled frame)
 - 
sun.security.ssl.ServerHandshaker.clientCertificate(sun.security.ssl.HandshakeMessage$CertificateMsg)
 @bci=190, line=1966 (Compiled frame)
 - sun.security.ssl.ServerHandshaker.processMessage(byte, int) @bci=160, 
line=237 (Compiled frame)
 - sun.security.ssl.Handshaker.processLoop() @bci=96, line=1052 (Compiled frame)
 - sun.security.ssl.Handshaker$1.run() @bci=4, line=992 (Compiled frame)
 - sun.security.ssl.Handshaker$1.run() @bci=1, line=989 (Compiled frame)
 - 
java.security.AccessController.doPrivileged(java.security.PrivilegedExceptionAction,
 java.security.AccessControlContext) @bci=0 (Compiled frame)
 - sun.security.ssl.Handshaker$DelegatedTask.run() @bci=24, line=1467 (Compiled 
frame)
 - org.apache.kafka.common.network.SslTransportLayer.runDelegatedTasks() 
@bci=13, line=393 (Compiled frame)
 - org.apache.kafka.common.network.SslTransportLayer.handshakeUnwrap(boolean) 
@bci=88, line=473 (Compiled frame)
 - org.apache.kafka.common.network.SslTransportLayer.doHandshake() @bci=570, 
line=331 (Compiled frame)
 - org.apache.kafka.common.network.SslTransportLayer.handshake() @bci=48, 
line=258 (Compiled frame)
 - org.apache.kafka.common.network.KafkaChannel.prepare() @bci=16, line=125 
(Compiled frame)
 - org.apache.kafka.common.network.Selector.pollSelectionKeys(java.util.Set, 
boolean, long) @bci=251, line=494 (Compiled frame)
 - org.apache.kafka.common.network.Selector.poll(long) @bci=312, line=432 
(Compiled frame)
 - kafka.network.Processor.poll() @bci=7, line=678 (Compiled frame)
 - kafka.network.Processor.run() @bci=31, line=583 (Compiled frame)
 - java.lang.Thread.run() @bci=11, line=748 (Interpreted frame)
{code}





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to