GitHub user poetwang opened a pull request:

    https://github.com/apache/curator/pull/261

    Fix race condition on cache in ServiceCacheImpl

    There is a race condition on the variable cache: both the main thread
    (from start) and the "Curator-ServiceProvider" thread (from
    childEvent -> addInstance -> cache.clearDataBytes) access the data in cache.
    The following sequence causes an NPE in JsonInstanceSerializer:
    The main thread finishes cache.start(true).
    Then an event (CHILD_ADDED or CHILD_UPDATED) arrives, and the
    Curator-ServiceProvider thread clears the cached data (in addInstance).
    Then the main thread continues and calls addInstance on the cached data,
    which is now null. This causes the NPE on line 193.
    
    I think this is the cause of
    https://stackoverflow.com/questions/42007102/apache-curator-npe-in-jsoninstanceserializer
    It also causes druid-0.11.0 indexing tasks to fail to start:
    `
    2018-03-22T12:44:17,004 ERROR [main] io.druid.cli.CliPeon - Error when starting up.  Failing.
    java.lang.reflect.InvocationTargetException
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_121]
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_121]
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_121]
            at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_121]
            at io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler.start(Lifecycle.java:414) ~[java-util-0.11.0.jar:0.11.0]
            at io.druid.java.util.common.lifecycle.Lifecycle.start(Lifecycle.java:311) ~[java-util-0.11.0.jar:0.11.0]
            at io.druid.guice.LifecycleModule$2.start(LifecycleModule.java:156) ~[druid-api-0.11.0.jar:0.11.0]
            at io.druid.cli.GuiceRunnable.initLifecycle(GuiceRunnable.java:101) [druid-services-0.11.0.jar:0.11.0]
            at io.druid.cli.CliPeon.run(CliPeon.java:283) [druid-services-0.11.0.jar:0.11.0]
            at io.druid.cli.Main.main(Main.java:108) [druid-services-0.11.0.jar:0.11.0]
    Caused by: java.lang.NullPointerException
            at org.codehaus.jackson.JsonFactory.createJsonParser(JsonFactory.java:604) ~[jackson-core-asl-1.9.13.jar:1.9.13]
            at org.codehaus.jackson.map.ObjectMapper.readValue(ObjectMapper.java:1973) ~[jackson-mapper-asl-1.9.13.jar:1.9.13]
            at org.apache.curator.x.discovery.details.JsonInstanceSerializer.deserialize(JsonInstanceSerializer.java:86) ~[curator-x-discovery-4.0.0.jar:?]
            at org.apache.curator.x.discovery.details.ServiceCacheImpl.addInstance(ServiceCacheImpl.java:200) ~[curator-x-discovery-4.0.0.jar:?]
            at org.apache.curator.x.discovery.details.ServiceCacheImpl.start(ServiceCacheImpl.java:102) ~[curator-x-discovery-4.0.0.jar:?]
            at org.apache.curator.x.discovery.details.ServiceProviderImpl.start(ServiceProviderImpl.java:75) ~[curator-x-discovery-4.0.0.jar:?]
            at io.druid.curator.discovery.ServerDiscoverySelector.start(ServerDiscoverySelector.java:132) ~[druid-server-0.11.0.jar:0.11.0]
            ... 10 more
    `
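
The interleaving described above can be replayed deterministically without ZooKeeper. The following is only a minimal sketch: Node, deserialize, addInstanceUnsafe, addInstanceGuarded and replayUnsafe are hypothetical stand-ins, not Curator classes or methods, and the guarded variant is one possible way to tolerate cleared data, not necessarily the change made in this pull request. The two "threads" are replayed as sequential calls so that the outcome does not depend on timing.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

public class CacheRaceSketch {
    // Hypothetical stand-in for a cached child node whose payload can be cleared.
    static final class Node {
        final String path;
        private volatile byte[] data;

        Node(String path, byte[] data) {
            this.path = path;
            this.data = data;
        }

        byte[] getData() { return data; }

        void clearData() { data = null; }
    }

    // Hypothetical stand-in for the instance map kept by the service cache.
    static final Map<String, String> instances = new ConcurrentHashMap<>();

    // Hypothetical stand-in for the serializer: rejects null input with an NPE.
    static String deserialize(byte[] bytes) {
        Objects.requireNonNull(bytes, "payload already cleared");
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Unguarded variant: deserialize, store, then clear the cached bytes.
    static void addInstanceUnsafe(Node node) {
        instances.put(node.path, deserialize(node.getData()));
        node.clearData();
    }

    // Guarded variant (an assumption, not the PR's code): one caller at a time,
    // and a node whose bytes were already cleared is skipped.
    static synchronized boolean addInstanceGuarded(Node node) {
        byte[] bytes = node.getData();
        if (bytes == null) {
            return false; // another caller already processed this node
        }
        instances.put(node.path, deserialize(bytes));
        node.clearData();
        return true;
    }

    // Replays the problematic order: the event handler runs first and clears
    // the bytes, then the start path processes the same node again.
    static boolean replayUnsafe() {
        Node node = new Node("/svc/a", "a".getBytes(StandardCharsets.UTF_8));
        addInstanceUnsafe(node); // "Curator-ServiceProvider" thread
        try {
            addInstanceUnsafe(node); // main thread, still inside start
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("unsafe replay hit NPE: " + replayUnsafe());
        Node node = new Node("/svc/b", "b".getBytes(StandardCharsets.UTF_8));
        System.out.println("guarded first call stored: " + addInstanceGuarded(node));
        System.out.println("guarded second call stored: " + addInstanceGuarded(node));
    }
}
```

In this sketch the guarded variant reads the bytes once under a lock and skips a node that was already processed, so the second caller never passes null to the deserializer.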


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/poetwang/curator fix-cache-race-condition

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/curator/pull/261.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #261
    
----
commit abf78960c7000478e90d220845d0eda873ec3c7e
Author: Wei Wang <wei.w@...>
Date:   2018-03-22T18:31:13Z

    Fix race condition on cache in ServiceCacheImpl

----

