pqab opened a new issue, #21933:
URL: https://github.com/apache/pulsar/issues/21933

   ### Search before asking
   
   - [X] I searched in the [issues](https://github.com/apache/pulsar/issues) and found nothing similar.
   
   
   ### Version
   
   2.10.5
   
   ### Minimal reproduce step
   
   1. Create topic
   
   ```
   bin/pulsar-admin tenants create tenant1
   bin/pulsar-admin namespaces create tenant1/namespace1
   bin/pulsar-admin namespaces set-persistence --bookkeeper-ack-quorum 2 --bookkeeper-ensemble 3 --bookkeeper-write-quorum 3 --ml-mark-delete-max-rate 0 tenant1/namespace1
   bin/pulsar-admin namespaces set-retention tenant1/namespace1 --size -1 --time 3d
   bin/pulsar-admin namespaces set-message-ttl tenant1/namespace1 --messageTTL 604800
   bin/pulsar-admin topics create-partitioned-topic tenant1/namespace1/topic1 -p 3
   ```
   
   2. Produce large, batched payloads with the `pulsar-perf` tool over TLS
   
   ```
   bin/pulsar-perf produce persistent://tenant1/namespace1/topic1 -mk autoIncrement -bb 5242880 -r 5000 -s 5242 -bm 1000 -threads 30 --auth-plugin org.apache.pulsar.client.impl.auth.AuthenticationTls --auth-params '{"tlsCertFile":"conf/user.cer","tlsKeyFile":"conf/user.key.pem"}'
   ```
   
   3. Stop the producer once it has produced around 1 million messages
   
   4. Wait until all the messages have gone to the BookKeeper backlog
   
   5. Start a consumer to consume all the messages with TLS
   
   ```
   bin/pulsar-perf consume persistent://tenant1/namespace1/topic1 --auth-plugin org.apache.pulsar.client.impl.auth.AuthenticationTls --auth-params '{"tlsCertFile":"conf/user.cer","tlsKeyFile":"conf/user.key.pem"}' -sp Earliest -ss sub1
   ```
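
For reference, a rough back-of-the-envelope check of the workload implied by the `pulsar-perf produce` flags in step 2. This is only a sketch: it assumes `-r` is the total publish rate in msg/s, `-s` the payload size in bytes, `-bm` the maximum number of messages per batch and `-bb` the maximum bytes per batch, and it ignores per-message metadata overhead. The function and field names are made up for illustration.

```python
# Values copied from the pulsar-perf command in step 2.
RATE_MSG_PER_SEC = 5000       # -r  (assumed: total publish rate)
MESSAGE_SIZE_BYTES = 5242     # -s  (assumed: payload size per message)
BATCH_MAX_MESSAGES = 1000     # -bm (assumed: max messages per batch)
BATCH_MAX_BYTES = 5242880     # -bb (assumed: max bytes per batch, 5 MiB)
TARGET_MESSAGES = 1_000_000   # step 3: "around 1 million messages"


def workload_summary(rate, size, batch_msgs, batch_bytes, target):
    """Derive a few headline numbers from the producer settings.

    Only payload bytes are counted; metadata overhead is ignored.
    """
    full_batch_payload = size * batch_msgs
    return {
        "payload_bytes_per_sec": rate * size,
        "full_batch_payload_bytes": full_batch_payload,
        "batch_fits_byte_limit": full_batch_payload <= batch_bytes,
        "byte_limit_headroom": batch_bytes - full_batch_payload,
        "seconds_to_target": target / rate,
        "total_payload_bytes": target * size,
    }


if __name__ == "__main__":
    summary = workload_summary(
        RATE_MSG_PER_SEC,
        MESSAGE_SIZE_BYTES,
        BATCH_MAX_MESSAGES,
        BATCH_MAX_BYTES,
        TARGET_MESSAGES,
    )
    for name, value in summary.items():
        print(f"{name}: {value}")
```

Under those assumptions, a full batch of 1000 messages carries 5,242,000 payload bytes, only 880 bytes below the 5,242,880-byte batch limit, and reaching 1 million messages takes about 200 seconds at the configured rate.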
   
   ### What did you expect to see?
   
   The consumer should be able to consume all the produced messages.
   
   ### What did you see instead?
   
   The consumer stopped receiving messages partway through, and the broker logs showed errors like
   
   ```
   2024-01-19T14:05:39,899+0000 [BookKeeperClientWorker-OrderedExecutor-4-0] ERROR org.apache.bookkeeper.proto.checksum.DigestManager - Mac mismatch for ledger-id: 852, entry-id: 35932
   2024-01-19T14:05:39,902+0000 [BookKeeperClientWorker-OrderedExecutor-4-0] ERROR org.apache.bookkeeper.proto.checksum.DigestManager - Mac mismatch for ledger-id: 852, entry-id: 35932
   2024-01-19T14:05:39,916+0000 [BookKeeperClientWorker-OrderedExecutor-4-0] ERROR org.apache.bookkeeper.proto.checksum.DigestManager - Mac mismatch for ledger-id: 852, entry-id: 35932
   2024-01-19T14:05:39,916+0000 [BookKeeperClientWorker-OrderedExecutor-4-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L852 E35899-E35998, Sent to [100.87.157.209:3181, 100.111.147.236:3181, 100.96.184.253:3181], Heard from [100.87.157.209:3181, 100.111.147.236:3181, 100.96.184.253:3181] : bitset = {0, 1, 2}, Error = 'Entry digest does not match'. First unread entry is (35973, rc = 0)
   2024-01-19T14:05:39,916+0000 [broker-topic-workers-OrderedExecutor-15-0] ERROR org.apache.pulsar.broker.service.persistent.PersistentDispatcherSingleActiveConsumer - [persistent://tenant1/namespace1/topic1-0 / sub1-Consumer{subscription=PersistentSubscription{topic=persistent://tenant1/namespace1/topic1-0, name=sub1}, consumerId=0, consumerName=383fd, address=/100.96.184.253:50090}] Error reading entries at 852:35899 : Entry digest does not match - Retrying to read in 15.0 seconds
   ```
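
To see which entries are affected across a larger broker log, something like the following could be used. It is a minimal sketch that assumes the log lines keep the same wording as the excerpt above; the function name and regexes are hypothetical helpers, not part of Pulsar or BookKeeper.

```python
import re
from collections import Counter

# Patterns modelled on the broker log excerpt in this issue.
# \s+ also matches newlines, so hard-wrapped log lines still match.
MAC_MISMATCH = re.compile(
    r"Mac mismatch for\s+ledger-id:\s*(\d+),\s*entry-id:\s*(\d+)"
)
FAILED_READ = re.compile(
    r"Read of ledger entry failed:\s+L(\d+)\s+E(\d+)-E(\d+)"
)


def summarize_digest_errors(log_text):
    """Return (mismatch_counts, failed_reads) for a broker log.

    mismatch_counts maps (ledger_id, entry_id) to the number of
    "Mac mismatch" lines seen for that entry.
    failed_reads is a list of (ledger_id, first_entry, last_entry)
    tuples, one per failed read request.
    """
    mismatch_counts = Counter(
        (int(ledger), int(entry))
        for ledger, entry in MAC_MISMATCH.findall(log_text)
    )
    failed_reads = [
        (int(ledger), int(first), int(last))
        for ledger, first, last in FAILED_READ.findall(log_text)
    ]
    return mismatch_counts, failed_reads


if __name__ == "__main__":
    import sys

    counts, reads = summarize_digest_errors(sys.stdin.read())
    for (ledger, entry), n in sorted(counts.items()):
        print(f"ledger {ledger} entry {entry}: {n} mismatch line(s)")
    for ledger, first, last in reads:
        print(f"failed read: ledger {ledger} entries {first}-{last}")
```

Run against the excerpt above, this would report a single affected entry (ledger 852, entry 35932), seen three times, inside the failed read range 35899-35998.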
   
   ### Anything else?
   
   This seems to happen only when an SSL exception occurs in the middle of producing, such as
   
   ```
   2024-01-19T13:39:13,450+0000 [pulsar-client-io-12-1] WARN  org.apache.pulsar.client.impl.ClientCnx - Got exception io.netty.handler.codec.DecoderException: io.netty.handler.ssl.ReferenceCountedOpenSslEngine$OpenSslException: error:100003fc:SSL routines:OPENSSL_internal:SSLV3_ALERT_BAD_RECORD_MAC
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:499)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
        at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:800)
        at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:499)
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:397)
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:829)
   Caused by: io.netty.handler.ssl.ReferenceCountedOpenSslEngine$OpenSslException: error:100003fc:SSL routines:OPENSSL_internal:SSLV3_ALERT_BAD_RECORD_MAC
        at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.newSSLExceptionForError(ReferenceCountedOpenSslEngine.java:1377)
        at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.shutdownWithError(ReferenceCountedOpenSslEngine.java:1089)
        at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.sslReadErrorResult(ReferenceCountedOpenSslEngine.java:1399)
        at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.unwrap(ReferenceCountedOpenSslEngine.java:1325)
        at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.unwrap(ReferenceCountedOpenSslEngine.java:1426)
        at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.unwrap(ReferenceCountedOpenSslEngine.java:1469)
        at io.netty.handler.ssl.SslHandler$SslEngineType$1.unwrap(SslHandler.java:223)
        at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1353)
        at io.netty.handler.ssl.SslHandler.decodeNonJdkCompatible(SslHandler.java:1257)
        at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1297)
        at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:529)
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:468)
        ... 15 more
   ```
   
   or
   
   ```
   2024-01-19T14:01:02,532+0000 [pulsar-client-io-6-1] WARN  org.apache.pulsar.client.impl.ClientCnx - Got exception io.netty.handler.codec.DecoderException: io.netty.handler.ssl.ReferenceCountedOpenSslEngine$OpenSslException: error:10000438:SSL routines:OPENSSL_internal:TLSV1_ALERT_INTERNAL_ERROR
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:499)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
        at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:800)
        at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:499)
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:397)
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:829)
   Caused by: io.netty.handler.ssl.ReferenceCountedOpenSslEngine$OpenSslException: error:10000438:SSL routines:OPENSSL_internal:TLSV1_ALERT_INTERNAL_ERROR
        at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.newSSLExceptionForError(ReferenceCountedOpenSslEngine.java:1377)
        at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.shutdownWithError(ReferenceCountedOpenSslEngine.java:1089)
        at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.sslReadErrorResult(ReferenceCountedOpenSslEngine.java:1399)
        at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.unwrap(ReferenceCountedOpenSslEngine.java:1325)
        at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.unwrap(ReferenceCountedOpenSslEngine.java:1426)
        at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.unwrap(ReferenceCountedOpenSslEngine.java:1469)
        at io.netty.handler.ssl.SslHandler$SslEngineType$1.unwrap(SslHandler.java:223)
        at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1353)
        at io.netty.handler.ssl.SslHandler.decodeNonJdkCompatible(SslHandler.java:1257)
        at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1297)
        at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:529)
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:468)
        ... 15 more
   ```
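
To line up the SSL exceptions in the producer log with the time window in which the bad ledger entries were written, the alerts can be pulled out with their timestamps. The sketch below assumes each log record starts with a timestamp in the format shown above and that the OpenSSL error text keeps the `error:<hex>:SSL routines:<lib>:<ALERT>` shape; `ssl_alerts` is a hypothetical helper name.

```python
import re

# A log record is assumed to start with a timestamp such as
# 2024-01-19T13:39:13,450+0000 at the beginning of a line.
RECORD_START = re.compile(
    r"^\s*(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2},\d{3}[+-]\d{4})",
    re.MULTILINE,
)
# OpenSSL error text as it appears in the exception message.
ALERT = re.compile(r"error:([0-9a-fA-F]+):SSL routines:\w+:(\w+)")


def ssl_alerts(log_text):
    """Return one (timestamp, error_code, alert_name) per log record.

    Only the first alert inside a record is reported, so the repeated
    "Caused by:" line of the same exception is not counted twice.
    """
    starts = [(m.start(), m.group(1)) for m in RECORD_START.finditer(log_text)]
    results = []
    for i, (pos, timestamp) in enumerate(starts):
        # A record runs until the next timestamped line (or end of log).
        end = starts[i + 1][0] if i + 1 < len(starts) else len(log_text)
        match = ALERT.search(log_text, pos, end)
        if match:
            results.append((timestamp, match.group(1), match.group(2)))
    return results


if __name__ == "__main__":
    import sys

    for timestamp, code, alert in ssl_alerts(sys.stdin.read()):
        print(f"{timestamp} {code} {alert}")
```

For the two excerpts above this would yield `SSLV3_ALERT_BAD_RECORD_MAC` at 13:39:13 and `TLSV1_ALERT_INTERNAL_ERROR` at 14:01:02, both before the consumer-side digest errors at 14:05:39.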
   
   ### Are you willing to submit a PR?
   
   - [ ] I'm willing to submit a PR!

