rdhabalia opened a new issue, #22894:
URL: https://github.com/apache/pulsar/issues/22894

   ### Search before asking
   
   - [X] I searched in the [issues](https://github.com/apache/pulsar/issues) 
and found nothing similar.
   
   
   ### Read release policy
   
   - [X] I understand that unsupported versions don't get bug fixes. I will 
attempt to reproduce the issue on a supported version of Pulsar client and 
Pulsar broker.
   
   
   ### Version
   
   >= 2.10
   
   ### Minimal reproduce step
   
   The broker log suddenly started showing the error below, and connected producers began seeing timeouts for published messages:
   ```
   04:05:33.877 [pulsar-stats-updater-24-1] INFO  
org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - 
[pulsar/prod2-gq1b/prod2b-broker23.messaging.gq1.yahoo.com:4080/persistent/sanityTestTopic-1]
    Closing inactive ledger, last-add entry 0
   04:06:33.877 [pulsar-stats-updater-24-1] INFO  
org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - 
[pulsar/prod2-gq1b/prod2b-broker23.messaging.gq1.yahoo.com:4080/persistent/sanityTestTopic-0]
    Closing inactive ledger, last-add entry 0
   04:06:33.877 [pulsar-stats-updater-24-1] INFO  
org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - 
[pulsar/prod2-gq1b/prod2b-broker23.messaging.gq1.yahoo.com:4080/persistent/sanityTestTopic-2]
    Closing inactive ledger, last-add entry 0
   04:06:33.877 [pulsar-stats-updater-24-1] INFO  
org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - 
[pulsar/prod2-gq1b/prod2b-broker23.messaging.gq1.yahoo.com:4080/persistent/sanityTestTopic-1]
    Closing inactive ledger, last-add entry 0
   04:07:26.759 [pulsar-acceptor-23-1] WARN  
io.netty.channel.DefaultChannelPipeline - An exceptionCaught() event was fired, 
and it reached at the tail of the pipeline. It usually means the last han
   dler in the pipeline did not handle the exception.
   io.netty.channel.unix.Errors$NativeIoException: accept(..) failed: Too many 
open files
   04:07:27.760 [pulsar-acceptor-23-1] WARN  
io.netty.channel.DefaultChannelPipeline - An exceptionCaught() event was fired, 
and it reached at the tail of the pipeline. It usually means the last handler 
in the pipeline did not handle the exception.
   io.netty.channel.unix.Errors$NativeIoException: accept(..) failed: Too many 
open files
   04:07:28.760 [pulsar-acceptor-23-1] WARN  
io.netty.channel.DefaultChannelPipeline - An exceptionCaught() event was fired, 
and it reached at the tail of the pipeline. It usually means the last handler 
in the pipeline did not handle the exception.
   io.netty.channel.unix.Errors$NativeIoException: accept(..) failed: Too many 
open files
   04:07:29.760 [pulsar-acceptor-23-1] WARN  
io.netty.channel.DefaultChannelPipeline - An exceptionCaught() event was fired, 
and it reached at the tail of the pipeline. It usually means the last handler 
in the pipeline did not handle the exception.
   io.netty.channel.unix.Errors$NativeIoException: accept(..) failed: Too many 
open files
   04:07:30.761 [pulsar-acceptor-23-1] WARN  
io.netty.channel.DefaultChannelPipeline - An exceptionCaught() event was fired, 
and it reached at the tail of the pipeline. It usually means the last handler 
in the pipeline did not handle the exception.
   io.netty.channel.unix.Errors$NativeIoException: accept(..) failed: Too many 
open files
   04:07:31.761 [pulsar-acceptor-23-1] WARN  
io.netty.channel.DefaultChannelPipeline - An exceptionCaught() event was fired, 
and it reached at the tail of the pipeline. It usually means the last handler 
in the pipeline did not handle the exception.
   io.netty.channel.unix.Errors$NativeIoException: accept(..) failed: Too many 
open files
   04:07:32.761 [pulsar-acceptor-23-1] WARN  
io.netty.channel.DefaultChannelPipeline - An exceptionCaught() event was fired, 
and it reached at the tail of the pipeline. It usually means the last handler 
in the pipeline did not handle the exception.
   io.netty.channel.unix.Errors$NativeIoException: accept(..) failed: Too many 
open files
   04:07:33.761 [pulsar-acceptor-23-1] WARN  
io.netty.channel.DefaultChannelPipeline - An exceptionCaught() event was fired, 
and it reached at the tail of the pipeline. It usually means the last handler 
in the pipeline did not handle the exception.
   io.netty.channel.unix.Errors$NativeIoException: accept(..) failed: Too many 
open files
   ```
   
   Listing open files shows that a large number of connections are in the `CLOSE_WAIT` state, but the broker emits no additional information when it enters that state.
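
   The two symptoms fit together: each `CLOSE_WAIT` socket still holds a file descriptor, so once the per-process limit is exhausted, `accept(2)` fails with "Too many open files". A minimal diagnostic sketch (not part of the original report; it assumes a Linux host, where state code `08` in `/proc/net/tcp` means `CLOSE_WAIT`):

   ```shell
   # Count sockets stuck in CLOSE_WAIT: the peer sent FIN but the local
   # process (here, the broker) never closed its end of the connection.
   # In /proc/net/tcp the fourth column is the state; 08 == CLOSE_WAIT.
   cw=$(awk 'NR > 1 && $4 == "08"' /proc/net/tcp /proc/net/tcp6 2>/dev/null | wc -l)
   echo "CLOSE_WAIT sockets: $cw"

   # Compare open descriptors against the per-process limit; when they are
   # exhausted, accept(2) fails with EMFILE ("Too many open files").
   # Inspect the broker PID in practice; /proc/self is a stand-in here.
   echo "fd limit: $(ulimit -n)"
   echo "fds open: $(ls /proc/self/fd | wc -l)"
   ```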
   
   **Please let us know if anyone else is facing a similar issue, so we can avoid duplicated investigation effort: after spending a lot of time on a fix, we would not want to see a duplicate PR addressing the same problem.**
   
   ### What did you expect to see?
   
   The broker should not go into such an unresponsive state.
   
   ### What did you see instead?
   
   Clients started seeing publish timeouts.
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [ ] I'm willing to submit a PR!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
