merlimat opened a new pull request #7508:
URL: https://github.com/apache/pulsar/pull/7508


   ### Motivation
   
   A deadlock is possible if a request to create a function is received 
while the leader is still tailing the metadata topic for updates. 
   
   The leader's tailer thread is trying to acquire the 
`FunctionMetaDataManager` lock:
   ```
   "function-metadata-tailer-thread" #379 prio=5 os_prio=31 tid=0x00007ffae8c58800 nid=0x2fb03 waiting for monitor entry [0x000070001a56a000]
      java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.pulsar.functions.worker.FunctionMetaDataManager.processUpdate(FunctionMetaDataManager.java:415)
        - waiting to lock <0x000000078c803368> (a org.apache.pulsar.functions.worker.FunctionMetaDataManager)
        at org.apache.pulsar.functions.worker.FunctionMetaDataManager.processUncompactedMetaDataTopicMessage(FunctionMetaDataManager.java:331)
        at org.apache.pulsar.functions.worker.FunctionMetaDataManager.processMetaDataTopicMessage(FunctionMetaDataManager.java:316)
        at org.apache.pulsar.functions.worker.FunctionMetaDataTopicTailer.run(FunctionMetaDataTopicTailer.java:80)
        at java.lang.Thread.run(Thread.java:748)
   ```
   
   But that lock is held by the REST handler thread, which is itself waiting 
for the leader to finish initializing: 
   
   ```
   "pulsar-web-73-14" #312 prio=5 os_prio=31 tid=0x00007ffb08ebd000 nid=0x34503 waiting on condition [0x000070001619e000]
      java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.pulsar.functions.worker.LeaderService.waitLeaderInit(LeaderService.java:155)
        at org.apache.pulsar.functions.worker.SchedulerManager.scheduleInternal(SchedulerManager.java:201)
        at org.apache.pulsar.functions.worker.SchedulerManager.schedule(SchedulerManager.java:229)
        at org.apache.pulsar.functions.worker.FunctionMetaDataManager.updateFunctionOnLeader(FunctionMetaDataManager.java:242)
        - locked <0x000000078c803368> (a org.apache.pulsar.functions.worker.FunctionMetaDataManager)
        at org.apache.pulsar.functions.worker.rest.api.ComponentImpl.internalProcessFunctionRequest(ComponentImpl.java:1547)
        at org.apache.pulsar.functions.worker.rest.api.ComponentImpl.updateRequest(ComponentImpl.java:863)
        at org.apache.pulsar.functions.worker.rest.api.FunctionsImpl.registerFunction(FunctionsImpl.java:231)
        at org.apache.pulsar.broker.admin.impl.FunctionsBase.registerFunction(FunctionsBase.java:174)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   ```
   
   ### Modification
   
   Reduce the scope of the mutex to exclude the call to 
`SchedulerManager.schedule()`, so the REST handler thread no longer holds the 
lock while it waits for the leader.
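
As a rough illustration of the idea (not the actual patch), the sketch below shrinks a `synchronized` region so that the blocking wait happens outside the monitor. The class `NarrowLockScope`, its fields, and `demo()` are hypothetical stand-ins; only the method names `processUpdate` and `schedule` echo the stack traces above, and a `CountDownLatch` is assumed in place of the real leader-initialization wait.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the metadata manager; not Pulsar code.
class NarrowLockScope {
    // Assumed replacement for the leader-initialization wait.
    private final CountDownLatch leaderReady = new CountDownLatch(1);
    // Lets the demo know the REST-like thread has reached schedule().
    private final CountDownLatch enteredSchedule = new CountDownLatch(1);
    private final List<String> metadata = new ArrayList<>();

    // Tailer side: needs the object monitor to apply an update.
    synchronized void processUpdate(String update) {
        metadata.add(update);
    }

    // Tailer side: signals that the leader has finished initializing.
    void finishTailing() {
        leaderReady.countDown();
    }

    // REST side: only the state change is guarded by the monitor.
    boolean updateOnLeader(String function, long timeoutMs) throws InterruptedException {
        synchronized (this) {
            metadata.add(function);
        }
        // The blocking call runs after the monitor has been released.
        return schedule(timeoutMs);
    }

    // Blocks until the leader is ready or the timeout expires.
    private boolean schedule(long timeoutMs) throws InterruptedException {
        enteredSchedule.countDown();
        return leaderReady.await(timeoutMs, TimeUnit.MILLISECONDS);
    }

    synchronized int size() {
        return metadata.size();
    }

    // Replays the interleaving from the thread dumps with the narrowed lock.
    static boolean demo() throws InterruptedException {
        NarrowLockScope manager = new NarrowLockScope();
        boolean[] scheduled = new boolean[1];
        Thread rest = new Thread(() -> {
            try {
                scheduled[0] = manager.updateOnLeader("my-function", 5000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        rest.start();
        manager.enteredSchedule.await();         // REST thread is now waiting for the leader
        manager.processUpdate("pending-update"); // monitor is free, so this does not block
        manager.finishTailing();
        rest.join();
        return scheduled[0] && manager.size() == 2;
    }
}
```

If `updateOnLeader` were declared `synchronized` as a whole, the call to `processUpdate` in `demo()` would block until the wait in `schedule()` timed out, which mirrors the two thread dumps above.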



