[GitHub] [pulsar] massakam opened issue #2666: Specific topic becomes unavailable due to failure of cursor recovery

GitHub Wed, 26 Sep 2018 22:29:31 -0700

Currently, a specific topic is unavailable in our Pulsar cluster. We cannot 
send messages to or receive messages from the topic, and HTTP requests for 
stats/stats-internal time out. We do not know how to reproduce the problem.


The info-internal output for that topic is attached:
[info-internal.txt](https://github.com/apache/pulsar/files/2422553/info-internal.txt)

The following error appeared in the broker log. Cursor recovery seems to have 
failed with an invalid range error.

```
08:58:51.367 [bookkeeper-ml-workers-OrderedExecutor-7-0] INFO  o.a.b.mledger.impl.ManagedCursorImpl - [xxxx/global/yyyy/persistent/zzzz] Cursor sub.30 recovered to position 1952728:-1
08:58:51.368 [bookkeeper-ml-workers-OrderedExecutor-7-0] ERROR o.a.b.common.util.SafeRunnable       - Unexpected throwable caught
java.lang.IllegalArgumentException: Invalid range: (1952728:-1..65425:1]
        at com.google.common.collect.Range.<init>(Range.java:352) ~[guava-21.0.jar:na]
        at com.google.common.collect.Range.create(Range.java:146) ~[guava-21.0.jar:na]
        at com.google.common.collect.Range.openClosed(Range.java:194) ~[guava-21.0.jar:na]
        at org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.recoveredCursor(ManagedCursorImpl.java:358) ~[managed-ledger-original-2.1.1-incubating.jar:2.1.1-incubating]
        at org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.access$300(ManagedCursorImpl.java:90) ~[managed-ledger-original-2.1.1-incubating.jar:2.1.1-incubating]
        at org.apache.bookkeeper.mledger.impl.ManagedCursorImpl$1.operationComplete(ManagedCursorImpl.java:244) ~[managed-ledger-original-2.1.1-incubating.jar:2.1.1-incubating]
        at org.apache.bookkeeper.mledger.impl.ManagedCursorImpl$1.operationComplete(ManagedCursorImpl.java:218) ~[managed-ledger-original-2.1.1-incubating.jar:2.1.1-incubating]
        at org.apache.bookkeeper.mledger.impl.MetaStoreImplZookeeper.lambda$null$7(MetaStoreImplZookeeper.java:240) ~[managed-ledger-original-2.1.1-incubating.jar:2.1.1-incubating]
        at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32) ~[managed-ledger-original-2.1.1-incubating.jar:2.1.1-incubating]
        at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) ~[bookkeeper-common-4.7.2.jar:4.7.2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_181]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_181]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-all-4.1.22.Final.jar:4.1.22.Final]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
```
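
To make the failure concrete: the exception message suggests that the lower bound of the range (1952728:-1) sorts after its upper bound (65425:1), which Guava's `Range` rejects. The following is only a minimal sketch and not Pulsar code. The `Position` class and the `isValidOpenClosed` helper are hypothetical, and it assumes positions are ordered by ledger ID first and entry ID second.

```java
public class CursorRangeCheck {

    /** Hypothetical stand-in for a managed-ledger position, printed as ledgerId:entryId. */
    static final class Position implements Comparable<Position> {
        final long ledgerId;
        final long entryId;

        Position(long ledgerId, long entryId) {
            this.ledgerId = ledgerId;
            this.entryId = entryId;
        }

        // Assumption: ledger ID is compared first, entry ID breaks ties.
        @Override
        public int compareTo(Position other) {
            int byLedger = Long.compare(ledgerId, other.ledgerId);
            return byLedger != 0 ? byLedger : Long.compare(entryId, other.entryId);
        }

        @Override
        public String toString() {
            return ledgerId + ":" + entryId;
        }
    }

    /** Mimics the precondition for an open-closed range: lower must not sort after upper. */
    static boolean isValidOpenClosed(Position lower, Position upper) {
        return lower.compareTo(upper) <= 0;
    }

    static String describe(Position lower, Position upper) {
        return "(" + lower + ".." + upper + "]";
    }

    public static void main(String[] args) {
        // The two positions taken from the exception message in the log above.
        Position recovered = new Position(1952728L, -1L);
        Position upper = new Position(65425L, 1L);

        // Prints: (1952728:-1..65425:1] valid? false
        System.out.println(describe(recovered, upper) + " valid? "
                + isValidOpenClosed(recovered, upper));
    }
}
```

Under that ordering assumption, ledger 1952728 is greater than ledger 65425, so the recovered cursor position lies beyond the other bound and no valid range can be built from the pair.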

I have a couple of questions:

- Is this caused by incorrect metadata in ZooKeeper, or is it a bug in the 
managed-ledger classes?
- How can we restore this topic to normal?

#### System configuration
**Pulsar version**: 2.1.1-incubating

[ Full content available at: https://github.com/apache/pulsar/issues/2666 ]