GitHub user poetwang opened a pull request:
https://github.com/apache/curator/pull/261
Fix race condition on cache in ServiceCacheImpl
There is a race condition on the variable cache: both the main thread (from
start) and the "Curator-ServiceProvider" thread (from childEvent ->
addInstance -> cache.clearDataBytes) access the data in the cache.
The following sequence causes an NPE in JsonInstanceSerializer:
The main thread finishes cache.start(true).
Then an event (CHILD_ADDED or CHILD_UPDATED) arrives, and the
Curator-ServiceProvider thread clears the cached data (in addInstance).
The main thread then continues and calls addInstance on the cached data,
which is now null. This causes an NPE on line 193.
I think this is the cause of
https://stackoverflow.com/questions/42007102/apache-curator-npe-in-jsoninstanceserializer
It also causes druid-0.11.0 indexing tasks to fail to start:
`
2018-03-22T12:44:17,004 ERROR [main] io.druid.cli.CliPeon - Error when starting up. Failing.
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_121]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_121]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_121]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_121]
    at io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler.start(Lifecycle.java:414) ~[java-util-0.11.0.jar:0.11.0]
    at io.druid.java.util.common.lifecycle.Lifecycle.start(Lifecycle.java:311) ~[java-util-0.11.0.jar:0.11.0]
    at io.druid.guice.LifecycleModule$2.start(LifecycleModule.java:156) ~[druid-api-0.11.0.jar:0.11.0]
    at io.druid.cli.GuiceRunnable.initLifecycle(GuiceRunnable.java:101) [druid-services-0.11.0.jar:0.11.0]
    at io.druid.cli.CliPeon.run(CliPeon.java:283) [druid-services-0.11.0.jar:0.11.0]
    at io.druid.cli.Main.main(Main.java:108) [druid-services-0.11.0.jar:0.11.0]
Caused by: java.lang.NullPointerException
    at org.codehaus.jackson.JsonFactory.createJsonParser(JsonFactory.java:604) ~[jackson-core-asl-1.9.13.jar:1.9.13]
    at org.codehaus.jackson.map.ObjectMapper.readValue(ObjectMapper.java:1973) ~[jackson-mapper-asl-1.9.13.jar:1.9.13]
    at org.apache.curator.x.discovery.details.JsonInstanceSerializer.deserialize(JsonInstanceSerializer.java:86) ~[curator-x-discovery-4.0.0.jar:?]
    at org.apache.curator.x.discovery.details.ServiceCacheImpl.addInstance(ServiceCacheImpl.java:200) ~[curator-x-discovery-4.0.0.jar:?]
    at org.apache.curator.x.discovery.details.ServiceCacheImpl.start(ServiceCacheImpl.java:102) ~[curator-x-discovery-4.0.0.jar:?]
    at org.apache.curator.x.discovery.details.ServiceProviderImpl.start(ServiceProviderImpl.java:75) ~[curator-x-discovery-4.0.0.jar:?]
    at io.druid.curator.discovery.ServerDiscoverySelector.start(ServerDiscoverySelector.java:132) ~[druid-server-0.11.0.jar:0.11.0]
    ... 10 more
`
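For illustration only, the interleaving described above can be modelled
without Curator. The sketch below is not the code in this pull request: the
class CacheRaceSketch, the type Entry, and the methods onChildEvent,
startUnguarded and startGuarded are hypothetical stand-ins. It replays the
interleaving on a single thread so the outcome is deterministic, then shows
one possible guard (skipping entries whose data has already been cleared).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified model of the race; none of these types are Curator classes.
class CacheRaceSketch {

    // Stand-in for a cached child node whose payload can be cleared after use.
    static final class Entry {
        final String path;
        private volatile byte[] data;

        Entry(String path, String payload) {
            this.path = path;
            this.data = payload.getBytes(StandardCharsets.UTF_8);
        }

        byte[] getData() { return data; }

        void clearData() { data = null; }
    }

    final Map<String, String> instances = new ConcurrentHashMap<>();

    // Stand-in for a serializer that cannot handle a null payload.
    static String deserialize(byte[] bytes) {
        Objects.requireNonNull(bytes, "payload was already cleared");
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Deserialize the payload, record the instance, then drop the raw bytes.
    void addInstance(Entry entry) {
        instances.put(entry.path, deserialize(entry.getData()));
        entry.clearData();
    }

    // What the listener thread does when a child event arrives.
    void onChildEvent(Entry entry) {
        addInstance(entry);
    }

    // Start-up loop without a guard: fails on an entry the listener already handled.
    void startUnguarded(List<Entry> current) {
        for (Entry entry : current) {
            addInstance(entry);
        }
    }

    // Start-up loop with a null check: skips entries the listener already handled.
    void startGuarded(List<Entry> current) {
        for (Entry entry : current) {
            if (entry.getData() != null) {
                addInstance(entry);
            }
        }
    }

    public static void main(String[] args) {
        Entry a = new Entry("/services/a", "instance-a");
        Entry b = new Entry("/services/b", "instance-b");
        List<Entry> current = Arrays.asList(a, b);

        CacheRaceSketch sketch = new CacheRaceSketch();
        // Simulated interleaving: the event for "a" lands before the start-up loop runs.
        sketch.onChildEvent(a);

        try {
            sketch.startUnguarded(current);
        } catch (NullPointerException e) {
            System.out.println("unguarded start failed: " + e.getMessage());
        }

        sketch.startGuarded(current);
        System.out.println("instances after guarded start: " + sketch.instances.size());
    }
}
```

Note that a null check like the one in startGuarded only narrows the window
in truly concurrent code, because the data could still be cleared between the
check and the read. Reading the data once into a local variable, or
synchronizing both code paths, would be needed to close the window completely.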
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/poetwang/curator fix-cache-race-condition
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/curator/pull/261.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #261
----
commit abf78960c7000478e90d220845d0eda873ec3c7e
Author: Wei Wang <wei.w@...>
Date: 2018-03-22T18:31:13Z
Fix race condition on cache in ServiceCacheImpl
----