Shawyeok opened a new issue #13194:
URL: https://github.com/apache/pulsar/issues/13194


   **Describe the bug**
   After upgraded our staging cluster from `2.6.3` to `2.8.1`, there are so 
many `reset position` logs with same thread and same cursor:
   
![image](https://user-images.githubusercontent.com/5058708/145220305-2454861b-2be9-4538-8a48-b8ca0aa426e8.png)
   
   thread `pulsar-backlog-quota-checker-31-1` is keeping `resetCursor` with 
same position forever:
   ```
   [arthas@32374]$ thread 95
   "pulsar-backlog-quota-checker-31-1" Id=95 TIMED_WAITING on 
java.util.concurrent.CountDownLatch$Sync@2e32da0
       at sun.misc.Unsafe.park(Native Method)
       -  waiting on java.util.concurrent.CountDownLatch$Sync@2e32da0
       at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
       at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
       at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
       at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
       at 
org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.resetCursor(ManagedCursorImpl.java:1182)
       at 
org.apache.pulsar.broker.service.BacklogQuotaManager.dropBacklogForTimeLimit(BacklogQuotaManager.java:251)
       at 
org.apache.pulsar.broker.service.BacklogQuotaManager.handleExceededBacklogQuota(BacklogQuotaManager.java:129)
       at 
org.apache.pulsar.broker.service.BrokerService.lambda$monitorBacklogQuota$66(BrokerService.java:1628)
       at 
org.apache.pulsar.broker.service.BrokerService$$Lambda$456/2083405311.accept(Unknown
 Source)
       at 
org.apache.pulsar.broker.service.BrokerService$$Lambda$889/1927928842.accept(Unknown
 Source)
       at java.util.Optional.ifPresent(Optional.java:159)
       at 
org.apache.pulsar.broker.service.BrokerService.lambda$forEachTopic$65(BrokerService.java:1612)
       at 
org.apache.pulsar.broker.service.BrokerService$$Lambda$454/15749765.accept(Unknown
 Source)
       at 
org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap$Section.forEach(ConcurrentOpenHashMap.java:387)
       at 
org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap.forEach(ConcurrentOpenHashMap.java:159)
       at 
org.apache.pulsar.broker.service.BrokerService.forEachTopic(BrokerService.java:1610)
       at 
org.apache.pulsar.broker.service.BrokerService.monitorBacklogQuota(BrokerService.java:1621)
       at 
org.apache.pulsar.broker.service.BrokerService$$Lambda$162/2143700452.run(Unknown
 Source)
       at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32)
       at 
org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36)
       at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
       at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
       at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
       at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
       at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
       at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
       at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
       at java.lang.Thread.run(Thread.java:748)
   
   [arthas@32374]$ watch org.apache.bookkeeper.mledger.impl.ManagedCursorImpl 
resetCursor params[0] 'params[0].ledgerId==4502648' -n 5
   Press Q or Ctrl+C to abort.
   Affect(class count: 1 , method count: 1) cost in 211 ms, listenerId: 3
   method=org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.resetCursor 
location=AtExit
   ts=2021-12-08 22:05:59; [cost=14.467419ms] result=@PositionImpl[
       ledgerId=@Long[4502648],
       entryId=@Long[5],
       ackSet=null,
       earliest=@PositionImpl[-1:-1],
       latest=@PositionImpl[9223372036854775807:9223372036854775807],
   ]
   method=org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.resetCursor 
location=AtExit
   ts=2021-12-08 22:05:59; [cost=14.628293ms] result=@PositionImpl[
       ledgerId=@Long[4502648],
       entryId=@Long[5],
       ackSet=null,
       earliest=@PositionImpl[-1:-1],
       latest=@PositionImpl[9223372036854775807:9223372036854775807],
   ]
   method=org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.resetCursor 
location=AtExit
   ts=2021-12-08 22:05:59; [cost=15.48457ms] result=@PositionImpl[
       ledgerId=@Long[4502648],
       entryId=@Long[5],
       ackSet=null,
       earliest=@PositionImpl[-1:-1],
       latest=@PositionImpl[9223372036854775807:9223372036854775807],
   ]
   method=org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.resetCursor 
location=AtExit
   ts=2021-12-08 22:05:59; [cost=14.051181ms] result=@PositionImpl[
       ledgerId=@Long[4502648],
       entryId=@Long[5],
       ackSet=null,
       earliest=@PositionImpl[-1:-1],
       latest=@PositionImpl[9223372036854775807:9223372036854775807],
   ]
   method=org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.resetCursor 
location=AtExit
   ts=2021-12-08 22:05:59; [cost=15.2071ms] result=@PositionImpl[
       ledgerId=@Long[4502648],
       entryId=@Long[5],
       ackSet=null,
       earliest=@PositionImpl[-1:-1],
       latest=@PositionImpl[9223372036854775807:9223372036854775807],
   ]
   Command execution times exceed limit: 5, so command will exit. You can set 
it with -n option.
   ```
   
   `internal-stats` of the topic:
   ```json
   {
     "waitingCursorsCount": 2,
     "pendingAddEntriesCount": 0,
     "entriesAddedCounter": 0,
     "numberOfEntries": 5,
     "totalSize": 2190,
     "currentLedgerEntries": 0,
     "currentLedgerSize": 0,
     "lastLedgerCreatedTimestamp": "2021-12-07T21:21:06.399+08:00",
     "lastLedgerCreationFailureTimestamp": "",
     "lastConfirmedEntry": "4502648:4",
     "state": "LedgerOpened",
     "ledgers": [
       {
         "ledgerId": 4502648,
         "entries": 5,
         "size": 2190,
         "offloaded": false
       },
       {
         "ledgerId": 4629342,
         "entries": 0,
         "size": 0,
         "offloaded": false
       }
     ],
     "cursors": {
       "java-b-notification": {
         "markDeletePosition": "4502648:4",
         "readPosition": "4502648:5",
         "waitingReadOp": true,
         "pendingReadOps": 0,
         "messagesConsumedCounter": 0,
         "cursorLedger": 4647267,
         "cursorLedgerLastEntry": 13249,
         "individuallyDeletedMessages": "[]",
         "lastLedgerWitchTimestamp": "",
         "state": "Open",
         "numberOfEntriesSinceFirstNotAckedMessage": 1,
         "totalNonContiguousDeletedMessagesRange": 0,
         "properties": {}
       }
     }
   }
   ```
   
   ledger `4502648` metadata:
   ```json
   {
     "4502648": {
       "storeCtime": false,
       "hasPassword": false,
       "metadataFormatVersion": 3,
       "ensembleSize": 2,
       "writeQuorumSize": 2,
       "ackQuorumSize": 2,
       "length": 2190,
       "lastEntryId": 4,
       "ctime": 1637199000612,
       "cToken": 0,
       "state": "CLOSED",
       "digestType": "CRC32C",
       "allEnsembles": {
         "0": [
           {
             "port": 0,
             "hostname": ""
           },
           {
             "port": 0,
             "hostname": ""
           }
         ]
       },
       "currentEnsemble": null,
       "password": "",
       "customMetadata": {
         "application": "cHVsc2Fy",
         "component": "bWFuYWdlZC1sZWRnZXI=",
         "pulsar/managed-ledger": 
"Yi9qb2J0b3AvcGVyc2lzdGVudC90YWxlbnRib21iLnNlc3Npb24uY3JlYXRlZC1wYXJ0aXRpb24tMQ=="
       }
     }
   }
   ```
   
   <details>
   <summary>broker.conf</summary>
   <pre>
   zookeeperServers=127.0.0.1:2181
   configurationStoreServers=127.0.0.1:2181
   brokerServicePort=6650
   brokerServicePortTls=6651
   webServicePort=8080
   webServicePortTls=8443
   bindAddress=0.0.0.0
   advertisedAddress=127.0.0.1
   haProxyProtocolEnabled=false
   numIOThreads=32
   numOrderedExecutorThreads=8
   numCacheExecutorThreadPoolSize=10
   enableBusyWait=false
   maxConcurrentHttpRequests=1024
   clusterName=test-cluster
   maxTenants=0
   failureDomainsEnabled=false
   zooKeeperSessionTimeoutMillis=30000
   zooKeeperOperationTimeoutSeconds=30
   zooKeeperCacheExpirySeconds=300
   brokerShutdownTimeoutMs=60000
   skipBrokerShutdownOnOOM=false
   backlogQuotaCheckEnabled=true
   backlogQuotaCheckIntervalInSeconds=60
   backlogQuotaDefaultLimitGB=10
   backlogQuotaDefaultLimitSecond=259200
   backlogQuotaDefaultRetentionPolicy=consumer_backlog_eviction
   ttlDurationDefaultInSeconds=0
   allowAutoTopicCreation=true
   allowAutoTopicCreationType=partitioned
   allowAutoSubscriptionCreation=true
   defaultNumPartitions=3
   brokerDeleteInactiveTopicsEnabled=true
   brokerDeleteInactiveTopicsFrequencySeconds=60
   brokerDeleteInactiveTopicsMode=delete_when_subscriptions_caught_up
   brokerDeleteInactivePartitionedTopicMetadataEnabled=false
   brokerDeleteInactiveTopicsMaxInactiveDurationSeconds=259200
   forceDeleteTenantAllowed=false
   forceDeleteNamespaceAllowed=false
   maxPendingPublishRequestsPerConnection=2000
   messageExpiryCheckIntervalInMinutes=5
   activeConsumerFailoverDelayTimeMillis=1000
   subscriptionExpirationTimeMinutes=4320
   subscriptionRedeliveryTrackerEnabled=true
   subscriptionExpiryCheckIntervalInMinutes=5
   subscriptionTypesEnabled=Exclusive,Shared,Failover,Key_Shared
   subscriptionKeySharedEnable=true
   subscriptionKeySharedUseConsistentHashing=false
   subscriptionKeySharedConsistentHashingReplicaPoints=100
   brokerDeduplicationEnabled=false
   brokerDeduplicationMaxNumberOfProducers=10000
   brokerDeduplicationSnapshotFrequencyInSeconds=10
   brokerDeduplicationSnapshotIntervalSeconds=120
   brokerDeduplicationEntriesInterval=1000
   brokerDeduplicationProducerInactivityTimeoutMinutes=360
   defaultNumberOfNamespaceBundles=4
   maxNamespacesPerTenant=0
   maxTopicsPerNamespace=0
   clientLibraryVersionCheckEnabled=false
   preferLaterVersions=false
   maxUnackedMessagesPerConsumer=500
   maxUnackedMessagesPerSubscription=5000
   maxUnackedMessagesPerBroker=0
   maxUnackedMessagesPerSubscriptionOnBrokerBlocked=0.16
   unblockStuckSubscriptionEnabled=false
   topicPublisherThrottlingTickTimeMillis=10
   preciseTopicPublishRateLimiterEnable=false
   brokerPublisherThrottlingTickTimeMillis=50
   brokerPublisherThrottlingMaxMessageRate=0
   brokerPublisherThrottlingMaxByteRate=0
   maxPublishRatePerTopicInMessages=0
   maxPublishRatePerTopicInBytes=0
   subscribeThrottlingRatePerConsumer=0
   subscribeRatePeriodPerConsumerInSecond=30
   dispatchThrottlingRatePerTopicInMsg=0
   dispatchThrottlingRatePerTopicInByte=0
   dispatchThrottlingRatePerSubscriptionInMsg=0
   dispatchThrottlingRatePerSubscriptionInByte=0
   dispatchThrottlingRatePerReplicatorInMsg=0
   dispatchThrottlingRatePerReplicatorInByte=0
   dispatchThrottlingRateRelativeToPublishRate=false
   dispatchThrottlingOnNonBacklogConsumerEnabled=true
   dispatcherMaxReadBatchSize=100
   dispatcherMaxReadSizeBytes=5242880
   dispatcherMinReadBatchSize=1
   dispatcherMaxRoundRobinBatchSize=20
   preciseDispatcherFlowControl=false
   maxConcurrentLookupRequest=50000
   maxConcurrentTopicLoadRequest=5000
   maxConcurrentNonPersistentMessagePerConnection=1000
   numWorkerThreadsForNonPersistentTopic=8
   enablePersistentTopics=true
   enableNonPersistentTopics=true
   enableRunBookieTogether=false
   enableRunBookieAutoRecoveryTogether=false
   maxProducersPerTopic=10000
   maxSameAddressProducersPerTopic=0
   encryptionRequireOnProducer=false
   maxConsumersPerTopic=10000
   maxSameAddressConsumersPerTopic=0
   maxSubscriptionsPerTopic=0
   maxConsumersPerSubscription=2000
   maxMessageSize=5242880
   brokerServiceCompactionMonitorIntervalInSeconds=60
   brokerServiceCompactionThresholdInBytes=0
   brokerServiceCompactionPhaseOneLoopTimeInSeconds=30
   delayedDeliveryEnabled=true
   delayedDeliveryTickTimeMillis=1000
   acknowledgmentAtBatchIndexLevelEnabled=false
   enableReplicatedSubscriptions=true
   replicatedSubscriptionsSnapshotFrequencyMillis=1000
   replicatedSubscriptionsSnapshotTimeoutSeconds=30
   replicatedSubscriptionsSnapshotMaxCachedPerSubscription=10
   retentionCheckIntervalInSeconds=120
   maxNumPartitionsPerPartitionedTopic=0
   zookeeperSessionExpiredPolicy=shutdown
   systemTopicEnabled=true
   topicLevelPoliciesEnabled=true
   topicFencingTimeoutSeconds=0
   authenticateOriginalAuthData=false
   tlsEnabled=false
   tlsCertRefreshCheckDurationSec=300
   tlsAllowInsecureConnection=false
   tlsRequireTrustedClientCertOnConnect=false
   tlsEnabledWithKeyStore=false
   tlsKeyStoreType=JKS
   tlsTrustStoreType=JKS
   brokerClientTlsEnabledWithKeyStore=false
   brokerClientTlsTrustStoreType=JKS
   authenticationEnabled=false
   authenticationRefreshCheckSeconds=60
   authorizationEnabled=false
   
authorizationProvider=org.apache.pulsar.broker.authorization.PulsarAuthorizationProvider
   authorizationAllowWildcardsMatching=false
   brokerClientTlsEnabled=false
   httpMaxRequestSize=-1
   disableHttpDebugMethods=false
   httpRequestsLimitEnabled=false
   httpRequestsMaxPerSecond=100.0
   bookkeeperClientTimeoutInSeconds=30
   bookkeeperClientSpeculativeReadTimeoutInMillis=0
   bookkeeperNumberOfChannelsPerBookie=32
   bookkeeperUseV2WireProtocol=false
   bookkeeperClientHealthCheckEnabled=true
   bookkeeperClientHealthCheckIntervalSeconds=60
   bookkeeperClientHealthCheckErrorThresholdPerInterval=5
   bookkeeperClientHealthCheckQuarantineTimeInSeconds=1800
   bookkeeperClientQuarantineRatio=1.0
   bookkeeperClientGetBookieInfoIntervalSeconds=86400
   bookkeeperClientGetBookieInfoRetryIntervalSeconds=60
   bookkeeperClientRackawarePolicyEnabled=true
   bookkeeperClientRegionawarePolicyEnabled=false
   bookkeeperClientMinNumRacksPerWriteQuorum=2
   bookkeeperClientEnforceMinNumRacksPerWriteQuorum=false
   bookkeeperClientReorderReadSequenceEnabled=false
   bookkeeperEnableStickyReads=false
   bookkeeperTLSProviderFactoryClass=org.apache.bookkeeper.tls.TLSContextFactory
   bookkeeperTLSClientAuthentication=false
   bookkeeperTLSKeyFileType=PEM
   bookkeeperTLSTrustCertTypes=PEM
   bookkeeperDiskWeightBasedPlacementEnabled=false
   bookkeeperExplicitLacIntervalInMills=0
   bookkeeperClientExposeStatsToPrometheus=true
   managedLedgerDefaultEnsembleSize=2
   managedLedgerDefaultWriteQuorum=2
   managedLedgerDefaultAckQuorum=2
   managedLedgerCursorPositionFlushSeconds = 60
   managedLedgerDigestType=CRC32C
   managedLedgerNumWorkerThreads=16
   managedLedgerNumSchedulerThreads=8
   managedLedgerCacheSizeMB=1024
   managedLedgerCacheCopyEntries=false
   managedLedgerCacheEvictionWatermark=0.9
   managedLedgerCacheEvictionFrequency=100.0
   managedLedgerCacheEvictionTimeThresholdMillis=1000
   managedLedgerCursorBackloggedThreshold=1000
   managedLedgerDefaultMarkDeleteRateLimit=1.0
   managedLedgerMaxEntriesPerLedger=50000
   managedLedgerMinLedgerRolloverTimeMinutes=10
   managedLedgerMaxLedgerRolloverTimeMinutes=240
   managedLedgerMaxSizePerLedgerMbytes=2048
   managedLedgerOffloadDeletionLagMs=14400000
   managedLedgerOffloadAutoTriggerSizeThresholdBytes=-1
   managedLedgerCursorMaxEntriesPerLedger=50000
   managedLedgerCursorRolloverTimeInSeconds=14400
   managedLedgerMaxUnackedRangesToPersist=1024
   managedLedgerMaxUnackedRangesToPersistInZooKeeper=1000
   autoSkipNonRecoverableData=true
   lazyCursorRecovery=false
   managedLedgerMetadataOperationsTimeoutSeconds=60
   managedLedgerReadEntryTimeoutSeconds=0
   managedLedgerAddEntryTimeoutSeconds=0
   managedLedgerPrometheusStatsLatencyRolloverSeconds=60
   managedLedgerTraceTaskExecution=true
   managedLedgerNewEntriesCheckDelayInMillis=10
   loadBalancerEnabled=true
   loadBalancerReportUpdateThresholdPercentage=10
   loadBalancerReportUpdateMaxIntervalMinutes=15
   loadBalancerHostUsageCheckIntervalMinutes=1
   loadBalancerSheddingEnabled=true
   loadBalancerSheddingIntervalMinutes=1
   loadBalancerSheddingGracePeriodMinutes=30
   loadBalancerBrokerMaxTopics=50000
   loadBalancerBrokerOverloadedThresholdPercentage=85
   loadBalancerResourceQuotaUpdateIntervalMinutes=15
   loadBalancerAutoBundleSplitEnabled=true
   loadBalancerAutoUnloadSplitBundlesEnabled=true
   loadBalancerNamespaceBundleMaxTopics=1000
   loadBalancerNamespaceBundleMaxSessions=1000
   loadBalancerNamespaceBundleMaxMsgRate=30000
   loadBalancerNamespaceBundleMaxBandwidthMbytes=100
   loadBalancerNamespaceMaximumBundles=128
   
loadManagerClassName=org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl
   
supportedNamespaceBundleSplitAlgorithms=range_equally_divide,topic_count_equally_divide
   defaultNamespaceBundleSplitAlgorithm=range_equally_divide
   
loadBalancerLoadSheddingStrategy=org.apache.pulsar.broker.loadbalance.impl.OverloadShedder
   loadBalancerBrokerThresholdShedderPercentage=10
   loadBalancerHistoryResourcePercentage=0.9
   loadBalancerBandwithInResourceWeight=1.0
   loadBalancerBandwithOutResourceWeight=1.0
   loadBalancerCPUResourceWeight=1.0
   loadBalancerMemoryResourceWeight=1.0
   loadBalancerDirectMemoryResourceWeight=1.0
   loadBalancerBundleUnloadMinThroughputThreshold=10
   replicationMetricsEnabled=true
   replicationConnectionsPerBroker=16
   replicationProducerQueueSize=1000
   replicatorPrefix=pulsar.repl
   replicationPolicyCheckDurationSeconds=600
   defaultRetentionTimeInMinutes=0
   defaultRetentionSizeInMB=0
   keepAliveIntervalSeconds=30
   webSocketServiceEnabled=false
   webSocketNumIoThreads=8
   webSocketConnectionsPerBroker=8
   webSocketSessionIdleTimeoutMillis=300000
   webSocketMaxTextFrameSize=1048576
   exposeTopicLevelMetricsInPrometheus=true
   exposeConsumerLevelMetricsInPrometheus=false
   exposeProducerLevelMetricsInPrometheus=false
   exposeManagedLedgerMetricsInPrometheus=true
   exposeManagedCursorMetricsInPrometheus=true
   metricsServletTimeoutMs=30000
   functionsWorkerEnabled=false
   exposePublisherStats=true
   statsUpdateFrequencyInSecs=60
   statsUpdateInitialDelayInSecs=60
   exposePreciseBacklogInPrometheus=false
   
schemaRegistryStorageClassName=org.apache.pulsar.broker.service.schema.BookkeeperSchemaStorageFactory
   isSchemaValidationEnforced=false
   offloadersDirectory=./offloaders
   managedLedgerOffloadMaxThreads=2
   managedLedgerOffloadPrefetchRounds=1
   managedLedgerUnackedRangesOpenCacheSetEnabled=true
   s3ManagedLedgerOffloadMaxBlockSizeInBytes=67108864
   s3ManagedLedgerOffloadReadBufferSizeInBytes=1048576
   gcsManagedLedgerOffloadMaxBlockSizeInBytes=67108864
   gcsManagedLedgerOffloadReadBufferSizeInBytes=1048576
   fileSystemProfilePath=../conf/filesystem_offload_core_site.xml
   replicationTlsEnabled=false
   brokerServicePurgeInactiveFrequencyInSeconds=60
   transactionCoordinatorEnabled=false
   
transactionMetadataStoreProviderClassName=org.apache.pulsar.transaction.coordinator.impl.InMemTransactionMetadataStoreProvider
   transactionBufferSnapshotMaxTransactionCount=1000
   transactionBufferSnapshotMinTimeInMillis=5000
   enablePackagesManagement=false
   
packagesManagementStorageProvider=org.apache.pulsar.packages.management.storage.bookkeeper.BookKeeperPackagesStorageProvider
   packagesReplicas=1
   packagesManagementLedgerRootPath=/ledgers
   </pre>
   </details>
   
   **Additional context**
   A workaround is force unsubscribe this subscription, but another 
subscription cursor will show up and keep logging...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to