massakam commented on issue #2289: Broker suddenly goes down
URL: https://github.com/apache/incubator-pulsar/issues/2289#issuecomment-412267676

Updates:

- This phenomenon occurred only once in v2.0.1, but occurred many times in v1.22.1, so the causes may differ.
- The v1.22.1 broker goes down when splitting and unloading a bundle. v2.0.1 and v1.21.0 do not go down.
- The following is a thread dump taken right before the v1.22.1 broker goes down: [threaddump.txt](https://github.com/apache/incubator-pulsar/files/2280143/threaddump.txt)
- This phenomenon does not occur if v1.22.1 is modified as follows:

```diff
--- a/pulsar-broker/src/main/java/org/apache/pulsar/broker/namespace/NamespaceService.java
+++ b/pulsar-broker/src/main/java/org/apache/pulsar/broker/namespace/NamespaceService.java
@@ -22,6 +22,7 @@ import static com.google.common.base.Preconditions.checkArgument;
 import static com.google.common.base.Preconditions.checkNotNull;
 import static java.lang.String.format;
 import static java.util.concurrent.TimeUnit.SECONDS;
+import static org.apache.bookkeeper.mledger.util.SafeRun.safeRun;
 import static org.apache.pulsar.broker.cache.LocalZooKeeperCacheService.LOCAL_POLICIES_ROOT;
 import static org.apache.pulsar.broker.web.PulsarWebResource.joinPath;
 import static org.apache.pulsar.common.naming.NamespaceBundleFactory.getBundlesData;
@@ -596,7 +597,7 @@ public class NamespaceService {
                     checkNotNull(ownershipCache.tryAcquiringOwnership(sBundle));
                 }
                 updateNamespaceBundles(nsname, splittedBundles.getLeft(),
-                    (rc, path, zkCtx, stat) -> {
+                    (rc, path, zkCtx, stat) -> pulsar.getOrderedExecutor().submit(safeRun(() -> {
                         if (rc == Code.OK.intValue()) {
                             // invalidate cache as zookeeper has new split
                             // namespace bundle
@@ -618,7 +619,7 @@ public class NamespaceService {
                             LOG.warn(msg);
                             updateFuture.completeExceptionally(new ServiceUnitNotReadyException(msg));
                         }
-                    });
+                    })));
             } catch (Exception e) {
                 String msg = format("failed to acquire ownership of split bundle for namespace [%s], %s",
                         nsname.toString(), e.getMessage());
```

From the above, the v1.22.1 broker probably goes down because the bug fixed by https://github.com/apache/incubator-pulsar/pull/115 has recurred with the following two changes:

- https://github.com/apache/incubator-pulsar/pull/1710
- https://github.com/apache/incubator-pulsar/pull/1752
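For readers unfamiliar with the pattern in the diff: the change moves the body of the callback off the thread that delivers it and onto a separate executor. Below is a minimal sketch of that hand-off using only `java.util.concurrent`. It does not use Pulsar, BookKeeper, or ZooKeeper APIs. `SingleThreadClient`, `updateThenRead`, and the `/bundles` path are hypothetical names made up for illustration, and `safeRun` is not modelled. It also assumes that the problem is a callback blocking the single thread that must deliver the result it is waiting for. That is a guess about the failure mode, not something the thread dump above has been shown to confirm.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class CallbackHandoffSketch {

    /** Hypothetical async client that delivers every callback and result on one thread. */
    static final class SingleThreadClient implements AutoCloseable {
        private final ExecutorService eventThread = Executors.newSingleThreadExecutor();

        /** Asynchronous read; the future is completed on the event thread. */
        CompletableFuture<String> read(String path) {
            CompletableFuture<String> future = new CompletableFuture<>();
            eventThread.execute(() -> future.complete("data@" + path));
            return future;
        }

        /** Asynchronous update; the callback is invoked on the event thread. */
        void update(String path, Runnable callback) {
            eventThread.execute(callback);
        }

        @Override
        public void close() {
            eventThread.shutdownNow();
        }
    }

    /**
     * Runs an update whose callback blocks on a second read.
     * With handOff == false the callback body runs inline on the event thread;
     * with handOff == true it is submitted to a separate worker executor.
     */
    static String updateThenRead(boolean handOff) throws Exception {
        SingleThreadClient client = new SingleThreadClient();
        ExecutorService worker = Executors.newSingleThreadExecutor();
        CompletableFuture<String> result = new CompletableFuture<>();
        try {
            Runnable body = () -> {
                try {
                    // Blocking wait for a result that only the event thread can deliver.
                    result.complete(client.read("/bundles").get(1, TimeUnit.SECONDS));
                } catch (Exception e) {
                    result.complete("timed out: callback thread was blocked");
                }
            };
            Runnable callback = handOff ? () -> worker.execute(body) : body;
            client.update("/bundles", callback);
            return result.get(5, TimeUnit.SECONDS);
        } finally {
            worker.shutdownNow();
            client.close();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("inline:   " + updateThenRead(false));
        System.out.println("hand-off: " + updateThenRead(true));
    }
}
```

In the inline case, the callback occupies the only event thread while waiting for a read that the same thread has to complete, so the wait can only end by timeout. In the hand-off case, the event thread returns immediately and stays free to deliver the read result to the worker.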
