devinbost commented on issue #6054: URL: https://github.com/apache/pulsar/issues/6054#issuecomment-834127635
I noticed a pattern when the freeze occurs. After adding some additional debug lines, we get a call to `PersistentTopic.asyncAddEntry()`, and queue an `addEntry` request in `ManagedLedgerImpl`, and then we get 2,000 or more lines like this: ``` May 6 22:18:02 fab08 a7d8f24158d5: 2021-05-07T04:18:02,245 [bookkeeper-ml-scheduler-OrderedScheduler-7-0] DEBUG org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [myTenant/myFunction/persistent/filter-function] Ledger already created when timeout task is triggered May 6 22:18:02 fab08 a7d8f24158d5: 2021-05-07T04:18:02,245 [bookkeeper-ml-scheduler-OrderedScheduler-7-0] DEBUG org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [myTenant/myFunction/persistent/filter-function] createComplete rc=-23 ledger=-1 May 6 22:18:02 fab08 a7d8f24158d5: 2021-05-07T04:18:02,252 [bookkeeper-ml-scheduler-OrderedScheduler-7-0] DEBUG org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [myTenant/myFunction/persistent/filter-function] Ledger already created when timeout task is triggered May 6 22:18:02 fab08 a7d8f24158d5: 2021-05-07T04:18:02,252 [bookkeeper-ml-scheduler-OrderedScheduler-7-0] DEBUG org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [myTenant/myFunction/persistent/filter-function] createComplete rc=-23 ledger=-1 May 6 22:18:02 fab08 a7d8f24158d5: 2021-05-07T04:18:02,259 [bookkeeper-ml-scheduler-OrderedScheduler-1-0] DEBUG org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [myTenant/myFunction/persistent/filter-function] Ledger already created when timeout task is triggered May 6 22:18:02 fab08 a7d8f24158d5: 2021-05-07T04:18:02,259 [bookkeeper-ml-scheduler-OrderedScheduler-1-0] DEBUG org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [myTenant/myFunction/persistent/filter-function] createComplete rc=-23 ledger=-1 ``` After that, we get several thousand lines with the ping/pong cycle like this with almost no other broker activity on the affected broker: ``` May 6 22:19:20 fab08 a7d8f24158d5: 2021-05-07T04:19:20,517 [pulsar-io-28-54] DEBUG org.apache.pulsar.common.protocol.PulsarDecoder - [/10.20.69.29:49766] Received cmd PING May 6 22:19:20 fab08 a7d8f24158d5: 2021-05-07T04:19:20,517 [pulsar-io-28-54] DEBUG org.apache.pulsar.common.protocol.PulsarHandler - [[id: 0x5d25db1f, L:/10.20.69.37:6650 - R:/10.20.69.29:49766]] Replying back to ping message May 6 22:19:20 fab08 a7d8f24158d5: 2021-05-07T04:19:20,519 [pulsar-io-28-54] DEBUG org.apache.pulsar.common.protocol.PulsarHandler - [[id: 0x5d25db1f, L:/10.20.69.37:6650 - R:/10.20.69.29:49766]] Sending ping message May 6 22:19:20 fab08 a7d8f24158d5: 2021-05-07T04:19:20,519 [pulsar-io-28-54] DEBUG org.apache.pulsar.common.protocol.PulsarDecoder - [/10.20.69.29:49766] Received cmd PONG May 6 22:19:20 fab08 a7d8f24158d5: 2021-05-07T04:19:20,760 [pulsar-io-28-55] DEBUG org.apache.pulsar.common.protocol.PulsarHandler - [[id: 0x9c75d310, L:/10.20.69.37:6650 - R:/10.20.69.37:51364]] Sending ping message May 6 22:19:20 fab08 a7d8f24158d5: 2021-05-07T04:19:20,762 [pulsar-io-28-55] DEBUG org.apache.pulsar.common.protocol.PulsarDecoder - [/10.20.69.37:51364] Received cmd PONG May 6 22:19:20 fab08 a7d8f24158d5: 2021-05-07T04:19:20,764 [pulsar-io-28-55] DEBUG org.apache.pulsar.common.protocol.PulsarDecoder - [/10.20.69.37:51364] Received cmd PING May 6 22:19:20 fab08 a7d8f24158d5: 2021-05-07T04:19:20,764 [pulsar-io-28-55] DEBUG org.apache.pulsar.common.protocol.PulsarHandler - [[id: 0x9c75d310, L:/10.20.69.37:6650 - R:/10.20.69.37:51364]] Replying back to ping message May 6 22:19:20 fab08 a7d8f24158d5: 2021-05-07T04:19:20,968 [pulsar-io-28-58] DEBUG org.apache.pulsar.common.protocol.PulsarDecoder - [/10.20.69.28:60676] Received cmd PING May 6 22:19:20 fab08 a7d8f24158d5: 2021-05-07T04:19:20,968 [pulsar-io-28-58] DEBUG org.apache.pulsar.common.protocol.PulsarHandler - [[id: 0x4465b5a6, L:/10.20.69.37:6651 - R:/10.20.69.28:60676]] Replying back to ping message May 6 22:19:20 fab08 a7d8f24158d5: 2021-05-07T04:19:20,969 [pulsar-io-28-58] DEBUG org.apache.pulsar.common.protocol.PulsarHandler - [[id: 0x4465b5a6, L:/10.20.69.37:6651 - R:/10.20.69.28:60676]] Sending ping message May 6 22:19:20 fab08 a7d8f24158d5: 2021-05-07T04:19:20,969 [pulsar-io-28-58] DEBUG org.apache.pulsar.common.protocol.PulsarDecoder - [/10.20.69.28:60676] Received cmd PONG May 6 22:19:21 fab08 a7d8f24158d5: 2021-05-07T04:19:21,178 [pulsar-io-28-61] DEBUG org.apache.pulsar.common.protocol.PulsarHandler - [[id: 0x8ca7517a, L:/10.20.69.37:6650 - R:/10.20.69.37:51370]] Sending ping message May 6 22:19:21 fab08 a7d8f24158d5: 2021-05-07T04:19:21,178 [pulsar-io-28-61] DEBUG org.apache.pulsar.common.protocol.PulsarDecoder - [/10.20.69.37:51370] Received cmd PONG May 6 22:19:21 fab08 a7d8f24158d5: 2021-05-07T04:19:21,181 [pulsar-io-28-61] DEBUG org.apache.pulsar.common.protocol.PulsarDecoder - [/10.20.69.37:51370] Received cmd PING May 6 22:19:21 fab08 a7d8f24158d5: 2021-05-07T04:19:21,181 [pulsar-io-28-61] DEBUG org.apache.pulsar.common.protocol.PulsarHandler - [[id: 0x8ca7517a, L:/10.20.69.37:6650 - R:/10.20.69.37:51370]] Replying back to ping message May 6 22:19:22 fab08 a7d8f24158d5: 2021-05-07T04:19:22,910 [pulsar-io-28-63] DEBUG org.apache.pulsar.common.protocol.PulsarHandler - [[id: 0xfb5df845, L:/10.20.69.37:6650 - R:/10.20.69.37:51376]] Sending ping message May 6 22:19:22 fab08 a7d8f24158d5: 2021-05-07T04:19:22,911 [pulsar-io-28-63] DEBUG org.apache.pulsar.common.protocol.PulsarDecoder - [/10.20.69.37:51376] Received cmd PONG May 6 22:19:22 fab08 a7d8f24158d5: 2021-05-07T04:19:22,913 [pulsar-io-28-63] DEBUG org.apache.pulsar.common.protocol.PulsarDecoder - [/10.20.69.37:51376] Received cmd PING May 6 22:19:22 fab08 a7d8f24158d5: 2021-05-07T04:19:22,913 [pulsar-io-28-63] DEBUG org.apache.pulsar.common.protocol.PulsarHandler - [[id: 0xfb5df845, L:/10.20.69.37:6650 - R:/10.20.69.37:51376]] Replying back to ping message May 6 22:19:23 fab08 a7d8f24158d5: 2021-05-07T04:19:23,148 [pulsar-io-28-65] DEBUG org.apache.pulsar.common.protocol.PulsarDecoder - [/10.20.69.28:54664] Received cmd PING May 6 22:19:23 fab08 a7d8f24158d5: 2021-05-07T04:19:23,148 [pulsar-io-28-65] DEBUG org.apache.pulsar.common.protocol.PulsarHandler - [[id: 0xf14980c1, L:/10.20.69.37:6650 - R:/10.20.69.28:54664]] Replying back to ping message May 6 22:19:23 fab08 a7d8f24158d5: 2021-05-07T04:19:23,148 [pulsar-io-28-65] DEBUG org.apache.pulsar.common.protocol.PulsarHandler - [[id: 0xf14980c1, L:/10.20.69.37:6650 - R:/10.20.69.28:54664]] Sending ping message May 6 22:19:23 fab08 a7d8f24158d5: 2021-05-07T04:19:23,150 [pulsar-io-28-65] DEBUG org.apache.pulsar.common.protocol.PulsarDecoder - [/10.20.69.28:54664] Received cmd PONG May 6 22:19:23 fab08 a7d8f24158d5: 2021-05-07T04:19:23,540 [pulsar-io-28-13] DEBUG org.apache.pulsar.common.protocol.PulsarHandler - [[id: 0x3e7436d4, L:/10.20.69.37:6651 - R:/10.20.17.40:51699]] Sending ping message May 6 22:19:23 fab08 a7d8f24158d5: 2021-05-07T04:19:23,802 [pulsar-io-28-13] DEBUG org.apache.pulsar.common.protocol.PulsarDecoder - [/10.20.17.40:51699] Received cmd PING May 6 22:19:23 fab08 a7d8f24158d5: 2021-05-07T04:19:23,802 [pulsar-io-28-13] DEBUG org.apache.pulsar.common.protocol.PulsarHandler - [[id: 0x3e7436d4, L:/10.20.69.37:6651 - R:/10.20.17.40:51699]] Replying back to ping message May 6 22:19:23 fab08 a7d8f24158d5: 2021-05-07T04:19:23,802 [pulsar-io-28-13] DEBUG org.apache.pulsar.common.protocol.PulsarDecoder - [/10.20.17.40:51699] Received cmd PONG ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
