lhotari edited a comment on pull request #11343:
URL: https://github.com/apache/pulsar/pull/11343#issuecomment-884747001


   @jerrypeng 
   
   > Why not? 
   
   For the thread runtime, I meant that the directory would also have to be 
unique per function instance in that case; a JVM-level unique directory isn't 
sufficient. 
   
   It would be possible to have a unique directory for every parallel instance 
of the function, but the problem with that is the **excessive disk space 
usage**. Each parallel instance would have to have its own unique directory. 
Some NAR/JAR files are very large. 50MB isn't uncommon. For example, 
pulsar-io-debezium-*.nar files are over 80MB (there are many others in this 
range). With a high parallelism value such as 8, several hundred additional 
MBs of disk space might be required if the solution were based on a unique 
directory per function instance. 
   
   > Even with your solution, it does not prevent the scenario when two 
different functions are started by submitting distinct NARs/JARs that is named 
the same.
   
   This PR addresses the case where the NARs or JARs use the same name. The 
content hash will be used as part of the directory name. 
   The file locking solution prevents race conditions that could happen when 
parallel function instances start up at the same time (a very common scenario 
when the parallelism exceeds the number of function workers and the process 
runtime is used).
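
To make the two ideas concrete, here is a minimal sketch of a content-hash directory name combined with a file lock. It is not the code in this PR: the class `HashedExtractDir`, the method names, the `.lock` and `.complete` file names, and the choice of SHA-256 are all assumptions made for illustration, and the extraction step is replaced by a plain file copy.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, for illustration only (not part of Pulsar).
public class HashedExtractDir {

    // Hex-encoded SHA-256 of the file content (the hash algorithm is an assumption).
    static String sha256Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = new DigestInputStream(Files.newInputStream(file), digest)) {
            byte[] buffer = new byte[8192];
            while (in.read(buffer) != -1) {
                // Reading the stream feeds the digest; nothing else to do here.
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Returns a directory whose name contains the content hash, so two archives
    // with the same file name but different content never share a directory.
    // "synchronized" serializes threads inside one JVM; the FileLock serializes
    // separate processes that start at the same time.
    static synchronized Path prepare(Path archive, Path baseDir)
            throws IOException, NoSuchAlgorithmException {
        Files.createDirectories(baseDir);
        String name = archive.getFileName() + "-" + sha256Hex(archive);
        Path target = baseDir.resolve(name);
        Path lockFile = baseDir.resolve(name + ".lock");
        try (FileChannel channel = FileChannel.open(lockFile,
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             FileLock lock = channel.lock()) {
            Path marker = target.resolve(".complete");
            if (!Files.exists(marker)) {
                Files.createDirectories(target);
                // Placeholder for the real NAR/JAR extraction.
                Files.copy(archive, target.resolve(archive.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING);
                // Written last, so a half-finished directory is never reused.
                Files.createFile(marker);
            }
        }
        return target;
    }
}
```

With this shape, all parallel instances of the same function share one directory on disk, while a different archive submitted under the same file name gets its own. Note that `FileLock` is held on behalf of the whole JVM, which is why the sketch also synchronizes in-process callers.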
   
   Is there some problem with the changes proposed in the PR? I'd like to 
complete work on this PR before my vacation which starts tomorrow.
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

