catsun26 opened a new issue, #17814:
URL: https://github.com/apache/pulsar/issues/17814

   ### Search before asking
   
   - [X] I searched in the [issues](https://github.com/apache/pulsar/issues) and found nothing similar.
   
   
   ### Version
   
   CentOS 7.9
   Cluster one: apache-pulsar-2.8.0
   Cluster two: apache-pulsar-2.8.0 with broker 2.8.1
   
   
   ### Minimal reproduce step
   
   1. In broker.conf, change the managed ledger quorum settings from 3/2/2 to 2/2/2, restart the broker, and stop one bookie:
   managedLedgerDefaultEnsembleSize=3
   managedLedgerDefaultWriteQuorum=2
   managedLedgerDefaultAckQuorum=2
   Change to
   managedLedgerDefaultEnsembleSize=2
   managedLedgerDefaultWriteQuorum=2
   managedLedgerDefaultAckQuorum=2
   
   2. Verify the runtime configuration:
   [root@pulsar1 apache-pulsar-2.8.0]# pulsar-admin brokers get-runtime-config | grep managedLedgerDefaultEnsembleSize
   "managedLedgerDefaultEnsembleSize    2"
   
   3. Produce a message with pulsar-client:
   pulsar-client produce bsc/k8s/test -n 1 -m "hello"
   
   14:29:47.029 [pulsar-client-io-1-1] INFO  
org.apache.pulsar.client.impl.ConnectionPool - [[id: 0x3d78fba3, 
L:/127.0.0.1:40608 - R:localhost/127.0.0.1:6660]] Connected to server
   14:29:47.136 [pulsar-client-io-1-1] INFO  
org.apache.pulsar.client.impl.ProducerStatsRecorderImpl - Starting Pulsar 
producer perf with config: {
     "topicName" : "bsc/k8s/test",
     "producerName" : null,
     "sendTimeoutMs" : 30000,
     "blockIfQueueFull" : false,
     "maxPendingMessages" : 1000,
     "maxPendingMessagesAcrossPartitions" : 50000,
     "messageRoutingMode" : "RoundRobinPartition",
     "hashingScheme" : "JavaStringHash",
     "cryptoFailureAction" : "FAIL",
     "batchingMaxPublishDelayMicros" : 1000,
     "batchingPartitionSwitchFrequencyByPublishDelay" : 10,
     "batchingMaxMessages" : 1000,
     "batchingMaxBytes" : 131072,
     "batchingEnabled" : true,
     "chunkingEnabled" : false,
     "compressionType" : "NONE",
     "initialSequenceId" : null,
     "autoUpdatePartitions" : true,
     "autoUpdatePartitionsIntervalSeconds" : 60,
     "multiSchema" : true,
     "accessMode" : "Shared",
     "properties" : { }
   }
   14:29:47.146 [pulsar-client-io-1-1] INFO  
org.apache.pulsar.client.impl.ProducerStatsRecorderImpl - Pulsar client config: 
{
     "serviceUrl" : "pulsar://localhost:6660/",
     "authPluginClassName" : null,
     "authParams" : null,
     "authParamMap" : null,
     "operationTimeoutMs" : 30000,
     "statsIntervalSeconds" : 60,
     "numIoThreads" : 1,
     "numListenerThreads" : 1,
     "connectionsPerBroker" : 1,
     "useTcpNoDelay" : true,
     "useTls" : false,
     "tlsTrustCertsFilePath" : "",
     "tlsAllowInsecureConnection" : false,
     "tlsHostnameVerificationEnable" : false,
     "concurrentLookupRequest" : 5000,
     "maxLookupRequest" : 50000,
     "maxLookupRedirects" : 20,
     "maxNumberOfRejectedRequestPerConnection" : 50,
     "keepAliveIntervalSeconds" : 30,
     "connectionTimeoutMs" : 10000,
     "requestTimeoutMs" : 60000,
     "initialBackoffIntervalNanos" : 100000000,
     "maxBackoffIntervalNanos" : 60000000000,
     "enableBusyWait" : false,
     "listenerName" : null,
     "useKeyStoreTls" : false,
     "sslProvider" : null,
     "tlsTrustStoreType" : "JKS",
     "tlsTrustStorePath" : "",
     "tlsTrustStorePassword" : "",
     "tlsCiphers" : [ ],
     "tlsProtocols" : [ ],
     "memoryLimitBytes" : 0,
     "proxyServiceUrl" : null,
     "proxyProtocol" : null,
     "enableTransaction" : false
   }
   14:29:47.201 [pulsar-client-io-1-1] INFO  
org.apache.pulsar.client.impl.ConnectionPool - [[id: 0xd4f73691, 
L:/127.0.0.1:40614 - R:localhost/127.0.0.1:6660]] Connected to server
   14:29:47.201 [pulsar-client-io-1-1] INFO  
org.apache.pulsar.client.impl.ClientCnx - [id: 0xd4f73691, L:/127.0.0.1:40614 - 
R:localhost/127.0.0.1:6660] Connected through proxy to target broker at 
pulsar3:6650
   14:29:47.208 [pulsar-client-io-1-1] INFO  
org.apache.pulsar.client.impl.ProducerImpl - [bsc/k8s/test] [null] Creating 
producer on cnx [id: 0xd4f73691, L:/127.0.0.1:40614 - 
R:localhost/127.0.0.1:6660]
   14:29:47.230 [pulsar-client-io-1-1] WARN  
org.apache.pulsar.client.impl.ClientCnx - [id: 0xd4f73691, L:/127.0.0.1:40614 - 
R:localhost/127.0.0.1:6660] Received error from server: 
org.apache.bookkeeper.mledger.ManagedLedgerException: Not enough non-faulty 
bookies available
   14:29:47.231 [pulsar-client-io-1-1] ERROR 
org.apache.pulsar.client.impl.ProducerImpl - [bsc/k8s/test] [null] Failed to 
create producer: org.apache.bookkeeper.mledger.ManagedLedgerException: Not 
enough non-faulty bookies available
   14:29:47.232 [pulsar-client-io-1-1] WARN  
org.apache.pulsar.client.impl.ConnectionHandler - [bsc/k8s/test] [null] Could 
not get connection to broker: 
org.apache.bookkeeper.mledger.ManagedLedgerException: Not enough non-faulty 
bookies available -- Will try again in 0.1 s
   14:29:47.333 [pulsar-timer-5-1] INFO  
org.apache.pulsar.client.impl.ConnectionHandler - [bsc/k8s/test] [null] 
Reconnecting after connection was closed
   
   4. Check the broker logs:
   .PerChannelBookieClient - Disconnected from bookie channel [id: 0xaff4064a, 
L:/192.168.209.83:54132 ! R:192.168.209.81/192.168.209.81:3181]
   14:29:19.727 [pulsar-io-4-7] INFO  
org.apache.bookkeeper.proto.PerChannelBookieClient - Disconnected from bookie 
channel [id: 0xe474d3ad, L:/192.168.209.83:54119 ! 
R:192.168.209.81/192.168.209.81:3181]
   14:29:19.727 [pulsar-io-4-2] WARN  
org.apache.bookkeeper.proto.PerChannelBookieClient - Exception caught on:[id: 
0x74fdd840, L:/192.168.209.83:54112 - R:192.168.209.81/192.168.209.81:3181] 
cause: readAddress(..) failed: Connection reset by peer
   14:29:19.727 [pulsar-io-4-2] INFO  
org.apache.bookkeeper.proto.PerChannelBookieClient - Disconnected from bookie 
channel [id: 0x74fdd840, L:/192.168.209.83:54112 ! 
R:192.168.209.81/192.168.209.81:3181]
   14:29:19.727 [pulsar-io-4-1] WARN  
org.apache.bookkeeper.proto.PerChannelBookieClient - Exception caught on:[id: 
0x320e056f, L:/192.168.209.83:54128 - R:192.168.209.81/192.168.209.81:3181] 
cause: readAddress(..) failed: Connection reset by peer
   14:29:19.727 [pulsar-io-4-1] INFO  
org.apache.bookkeeper.proto.PerChannelBookieClient - Disconnected from bookie 
channel [id: 0x320e056f, L:/192.168.209.83:54128 ! 
R:192.168.209.81/192.168.209.81:3181]
   14:29:19.779 [main-EventThread] INFO  
org.apache.bookkeeper.discover.ZKRegistrationClient - Invalidate cache for 
192.168.209.81:3181
   14:29:19.779 [main-EventThread] INFO  
org.apache.bookkeeper.discover.ZKRegistrationClient - Invalidate cache for 
192.168.209.81:3181
   14:29:19.783 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  
org.apache.bookkeeper.net.NetworkTopologyImpl - Removing a node: 
/default-rack/192.168.209.81:3181
   14:29:19.783 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  
org.apache.bookkeeper.net.NetworkTopologyImpl - Removing a node: 
/default-rack/192.168.209.81:3181
   14:29:20.897 [pulsar-web-40-3] INFO  org.eclipse.jetty.server.RequestLog - 
192.168.209.81 - - [23/Sep/2022:14:29:20 +0800] "GET /metrics HTTP/1.1" 302 0 
"-" "Prometheus/2.29.1" 0
   14:29:20.900 [prometheus-stats-41-1] INFO  
org.eclipse.jetty.server.RequestLog - 192.168.209.81 - - [23/Sep/2022:14:29:20 
+0800] "GET /metrics/ HTTP/1.1" 200 28244 "http://192.168.209.83:8080/metrics" 
"Prometheus/2.29.1" 2
   14:29:21.884 [pulsar-web-40-8] INFO  org.eclipse.jetty.server.RequestLog - 
192.168.209.81 - - [23/Sep/2022:14:29:21 +0800] "GET /metrics HTTP/1.1" 302 0 
"-" "Prometheus/2.29.1" 1
   14:29:21.887 [prometheus-stats-41-1] INFO  
org.eclipse.jetty.server.RequestLog - 192.168.209.81 - - [23/Sep/2022:14:29:21 
+0800] "GET /metrics/ HTTP/1.1" 200 28244 "http://192.168.209.83:8080/metrics" 
"Prometheus/2.29.1" 3
   14:29:35.897 [pulsar-web-40-1] INFO  org.eclipse.jetty.server.RequestLog - 
192.168.209.81 - - [23/Sep/2022:14:29:35 +0800] "GET /metrics HTTP/1.1" 302 0 
"-" "Prometheus/2.29.1" 0
   14:29:35.900 [prometheus-stats-41-1] INFO  
org.eclipse.jetty.server.RequestLog - 192.168.209.81 - - [23/Sep/2022:14:29:35 
+0800] "GET /metrics/ HTTP/1.1" 200 28244 "http://192.168.209.83:8080/metrics" 
"Prometheus/2.29.1" 3
   14:29:36.884 [pulsar-web-40-4] INFO  org.eclipse.jetty.server.RequestLog - 
192.168.209.81 - - [23/Sep/2022:14:29:36 +0800] "GET /metrics HTTP/1.1" 302 0 
"-" "Prometheus/2.29.1" 1
   14:29:36.891 [prometheus-stats-41-1] INFO  
org.eclipse.jetty.server.RequestLog - 192.168.209.81 - - [23/Sep/2022:14:29:36 
+0800] "GET /metrics/ HTTP/1.1" 200 28245 "http://192.168.209.83:8080/metrics" 
"Prometheus/2.29.1" 6
   14:29:47.058 [pulsar-io-4-5] INFO  
org.apache.pulsar.broker.service.ServerCnx - New connection from 
/192.168.209.81:33458
   14:29:47.208 [pulsar-io-4-6] INFO  
org.apache.pulsar.broker.service.ServerCnx - New connection from 
/192.168.209.81:33464
   14:29:47.215 [pulsar-io-4-6] INFO  
org.apache.pulsar.broker.service.ServerCnx - 
[/192.168.209.81:33464][persistent://bsc/k8s/test] Creating producer. 
producerId=0
   14:29:47.216 [pulsar-ordered-OrderedExecutor-3-0] INFO  
org.apache.pulsar.broker.PulsarService - No ledger offloader configured, using 
NULL instance
   14:29:47.216 [pulsar-ordered-OrderedExecutor-3-0] INFO  
org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - Opening managed ledger 
bsc/k8s/persistent/test
   14:29:47.217 [bookkeeper-ml-scheduler-OrderedScheduler-4-0] INFO  
org.apache.bookkeeper.mledger.impl.MetaStoreImpl - Creating 
'/managed-ledgers/bsc/k8s/persistent/test'
   14:29:47.220 [pulsar-ordered-OrderedExecutor-1-0-EventThread] INFO  
org.apache.pulsar.zookeeper.ZooKeeperCache - [State:CONNECTED Timeout:30000 
sessionid:0x100004f5f9b000b local:/192.168.209.83:36584 
remoteserver:pulsar1/192.168.209.81:2181 lastZxid:30064773910 xid:251 sent:251 
recv:254 queuedpkts:0 pendingresp:0 queuedevents:1] Received ZooKeeper watch 
event: WatchedEvent state:SyncConnected type:NodeCreated 
path:/managed-ledgers/bsc/k8s/persistent/test
   14:29:47.221 [metadata-store-6-1] INFO  
org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - 
[bsc/k8s/persistent/test] Creating ledger, metadata: {component=[109, 97, 110, 
97, 103, 101, 100, 45, 108, 101, 100, 103, 101, 114], 
pulsar/managed-ledger=[98, 115, 99, 47, 107, 56, 115, 47, 112, 101, 114, 115, 
105, 115, 116, 101, 110, 116, 47, 116, 101, 115, 116], application=[112, 117, 
108, 115, 97, 114]} - metadata ops timeout : 60 seconds
   14:29:47.221 [metadata-store-6-1] WARN  
org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to 
find 1 bookies : excludeBookies [<Bookie:192.168.209.82:3181>, 
<Bookie:192.168.209.83:3181>], allBookies [<Bookie:192.168.209.82:3181>, 
<Bookie:192.168.209.83:3181>].
   14:29:47.221 [metadata-store-6-1] WARN  
org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to 
find 1 bookies : excludeBookies [<Bookie:192.168.209.82:3181>, 
<Bookie:192.168.209.83:3181>], allBookies [<Bookie:192.168.209.82:3181>, 
<Bookie:192.168.209.83:3181>].
   14:29:47.221 [metadata-store-6-1] ERROR 
org.apache.bookkeeper.client.LedgerCreateOp - Not enough bookies to create 
ledger with ensembleSize=3, writeQuorumSize=2 and ackQuorumSize=2
   14:29:47.221 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR 
org.apache.bookkeeper.mledger.impl.ManagedLedgerFactoryImpl - 
[bsc/k8s/persistent/test] Failed to initialize managed ledger: Not enough 
non-faulty bookies available
   14:29:47.221 [pulsar-ordered-OrderedExecutor-1-0-EventThread] INFO  
org.apache.pulsar.zookeeper.ZooKeeperManagedLedgerCache - [State:CONNECTED 
Timeout:30000 sessionid:0x100004f5f9b000b local:/192.168.209.83:36584 
remoteserver:pulsar1/192.168.209.81:2181 lastZxid:30064773910 xid:251 sent:251 
recv:254 queuedpkts:0 pendingresp:0 queuedevents:0] Received ZooKeeper watch 
event: WatchedEvent state:SyncConnected type:NodeChildrenChanged 
path:/managed-ledgers/bsc/k8s/persistent
   14:29:47.221 [pulsar-ordered-OrderedExecutor-1-0-EventThread] INFO  
org.apache.pulsar.zookeeper.ZooKeeperManagedLedgerCache - invalidate called in 
zookeeperChildrenCache for path /managed-ledgers/bsc/k8s/persistent
   14:29:47.222 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  
org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - 
[bsc/k8s/persistent/test] Closing managed ledger
   14:29:47.222 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN  
org.apache.pulsar.broker.service.BrokerService - Failed to create topic 
persistent://bsc/k8s/test
   org.apache.bookkeeper.mledger.ManagedLedgerException: Not enough non-faulty 
bookies available
   14:29:47.350 [pulsar-io-4-6] INFO  
org.apache.pulsar.broker.service.ServerCnx - 
[/192.168.209.81:33464][persistent://bsc/k8s/test] Creating producer. 
producerId=0
   14:29:47.351 [pulsar-ordered-OrderedExecutor-3-0] INFO  
org.apache.pulsar.broker.PulsarService - No ledger offloader configured, using 
NULL instance
   14:29:47.351 [pulsar-ordered-OrderedExecutor-3-0] INFO  
org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - Opening managed ledger 
bsc/k8s/persistent/test
   14:29:47.352 [bookkeeper-ml-scheduler-OrderedScheduler-4-0] INFO  
org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - 
[bsc/k8s/persistent/test] Creating ledger, metadata: {component=[109, 97, 110, 
97, 103, 101, 100, 45, 108, 101, 100, 103, 101, 114], 
pulsar/managed-ledger=[98, 115, 99, 47, 107, 56, 115, 47, 112, 101, 114, 115, 
105, 115, 116, 101, 110, 116, 47, 116, 101, 115, 116], application=[112, 117, 
108, 115, 97, 114]} - metadata ops timeout : 60 seconds
   14:29:47.353 [bookkeeper-ml-scheduler-OrderedScheduler-4-0] WARN  
org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to 
find 1 bookies : excludeBookies [<Bookie:192.168.209.82:3181>, 
<Bookie:192.168.209.83:3181>], allBookies [<Bookie:192.168.209.82:3181>, 
<Bookie:192.168.209.83:3181>].
   14:29:47.353 [bookkeeper-ml-scheduler-OrderedScheduler-4-0] WARN  
org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to 
find 1 bookies : excludeBookies [<Bookie:192.168.209.82:3181>, 
<Bookie:192.168.209.83:3181>], allBookies [<Bookie:192.168.209.82:3181>, 
<Bookie:192.168.209.83:3181>].
   14:29:47.353 [bookkeeper-ml-scheduler-OrderedScheduler-4-0] ERROR 
org.apache.bookkeeper.client.LedgerCreateOp - Not enough bookies to create 
ledger with ensembleSize=3, writeQuorumSize=2 and ackQuorumSize=2
   14:29:47.353 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR 
org.apache.bookkeeper.mledger.impl.ManagedLedgerFactoryImpl - 
[bsc/k8s/persistent/test] Failed to initialize managed ledger: Not enough 
non-faulty bookies available
   14:29:47.353 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  
org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - 
[bsc/k8s/persistent/test] Closing managed ledger
   14:29:47.353 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN  
org.apache.pulsar.broker.service.BrokerService - Failed to create topic 
persistent://bsc/k8s/test
   org.apache.bookkeeper.mledger.ManagedLedgerException: Not enough non-faulty 
bookies available
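
The broker log above reports `ensembleSize=3` in the `LedgerCreateOp` error, while the `get-runtime-config` output in step 2 reports 2. The following is a minimal Python sketch, assuming the log line formats shown above, for pulling out the quorum values the BookKeeper client actually used and the bookies it considered available. The helper names `parse_ledger_create_error` and `parse_all_bookies` are hypothetical and are not part of Pulsar or BookKeeper.

```python
import re

# Patterns based on the log lines pasted in this issue (assumed format).
LEDGER_CREATE_RE = re.compile(
    r"ensembleSize=(\d+), writeQuorumSize=(\d+) and ackQuorumSize=(\d+)"
)
ALL_BOOKIES_RE = re.compile(r"allBookies \[(.*?)\]")
BOOKIE_RE = re.compile(r"<Bookie:([^>]+)>")

# Sample lines copied from the broker log, with the e-mail line wrapping removed.
SAMPLE_ERROR_LINE = (
    "org.apache.bookkeeper.client.LedgerCreateOp - Not enough bookies to create "
    "ledger with ensembleSize=3, writeQuorumSize=2 and ackQuorumSize=2"
)
SAMPLE_PLACEMENT_LINE = (
    "Failed to find 1 bookies : excludeBookies [<Bookie:192.168.209.82:3181>, "
    "<Bookie:192.168.209.83:3181>], allBookies [<Bookie:192.168.209.82:3181>, "
    "<Bookie:192.168.209.83:3181>]."
)


def parse_ledger_create_error(line):
    """Return the quorum settings named in a LedgerCreateOp error, or None."""
    match = LEDGER_CREATE_RE.search(line)
    if match is None:
        return None
    ensemble, write_quorum, ack_quorum = (int(g) for g in match.groups())
    return {
        "ensemble": ensemble,
        "write_quorum": write_quorum,
        "ack_quorum": ack_quorum,
    }


def parse_all_bookies(line):
    """Return the bookie addresses listed after 'allBookies', or an empty list."""
    match = ALL_BOOKIES_RE.search(line)
    if match is None:
        return []
    return BOOKIE_RE.findall(match.group(1))


if __name__ == "__main__":
    used = parse_ledger_create_error(SAMPLE_ERROR_LINE)
    bookies = parse_all_bookies(SAMPLE_PLACEMENT_LINE)
    print("quorum settings used:", used)
    print("bookies seen by the client:", bookies)
```

Run against the two sample lines, this shows a requested ensemble of 3 against only two known bookies, which matches the "Not enough non-faulty bookies available" error.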
   
   
   
   
   
   
   
   ### What did you expect to see?
   
   The cluster should continue to produce and consume normally after one bookie is stopped.
   
   ### What did you see instead?
   
   The Pulsar cluster cannot produce or consume once one bookie is stopped.
   
   This happens after changing the managed ledger quorum settings in broker.conf from 3/2/2 to 2/2/2, restarting the broker, and stopping one bookie:
   managedLedgerDefaultEnsembleSize=3
   managedLedgerDefaultWriteQuorum=2
   managedLedgerDefaultAckQuorum=2
   Change to
   managedLedgerDefaultEnsembleSize=2
   managedLedgerDefaultWriteQuorum=2
   managedLedgerDefaultAckQuorum=2
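
The relationship between these three settings and the number of running bookies can be sketched as follows. This is an illustrative Python model, not Pulsar or BookKeeper code, and the function names are hypothetical. It assumes BookKeeper's usual constraint that ensemble size >= write quorum >= ack quorum, and that creating a ledger needs at least as many available bookies as the ensemble size.

```python
def validate_quorums(ensemble, write_quorum, ack_quorum):
    """Check the assumed constraint: ensemble >= write quorum >= ack quorum >= 1."""
    return ensemble >= write_quorum >= ack_quorum >= 1


def can_create_ledger(ensemble, write_quorum, ack_quorum, available_bookies):
    """A new ledger needs valid quorums and at least `ensemble` available bookies."""
    if not validate_quorums(ensemble, write_quorum, ack_quorum):
        return False
    return available_bookies >= ensemble


if __name__ == "__main__":
    # Three bookies in the cluster, one stopped, so two remain.
    remaining = 2
    for settings in [(3, 2, 2), (2, 2, 2)]:
        ok = can_create_ledger(*settings, available_bookies=remaining)
        print(settings, "with", remaining, "bookies ->", "ok" if ok else "fails")
```

Under this model, the 3/2/2 values that appear in the broker log cannot be satisfied by the two remaining bookies, whereas 2/2/2 could be.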
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [ ] I'm willing to submit a PR!

