Hi all,

I’d like to confirm whether the current behaviour of the config option
“livy.server.session.max-creation” is correct.

For example, I set the value of this option to “1” and try to submit 5 batch
jobs (the SparkPi example) to Livy at almost the same time, using a command
like this:

for i in {1..5}; do curl -X POST --data '{"file": "/tmp/spark-examples-2.jar", "className": "org.apache.spark.examples.SparkPi"}' -H "Content-Type: application/json" http://localhost:8999/batches; done

In that case, I expect that only the first job will be submitted successfully
and the others will be rejected with the response message “Rejected, too many
sessions are being created!”. This is indeed what happens when I submit them
manually, one by one. However, in the case above, almost always all 5 jobs
are submitted successfully (sometimes only one is rejected), regardless of the
value of the config option.

It looks like the code where Livy checks the number of already created sessions
and then launches a new one is not atomic:
https://github.com/apache/incubator-livy/blob/4cfb6bcb8fb9ac6b2d6c8b3d04b20f647b507e1f/server/src/main/scala/org/apache/livy/server/SessionServlet.scala#L130

Also, the calculation of the total number of child processes in
"tooManySessions()" is not an atomic operation.

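To illustrate what I mean, here is a minimal, self-contained sketch of the
check-then-act pattern (the names and structure are simplified for the
example; this is not the actual Livy code):

  // Simplified model of the race: the limit check and the bookkeeping
  // that would make the next check fail are two separate steps, so
  // several concurrent requests can all pass the check before any of
  // them registers a session.
  object RaceDemo {
    private var activeCreations = 0     // shared, unsynchronized state
    private val maxCreation = 1         // livy.server.session.max-creation

    private def tooManySessions(): Boolean = activeCreations >= maxCreation

    def submit(id: Int): Unit = {
      if (tooManySessions()) {
        println(s"job $id: Rejected, too many sessions are being created!")
      } else {
        Thread.sleep(10)                // widen the race window a bit
        activeCreations += 1            // happens only after the check
        println(s"job $id: accepted, active = $activeCreations")
      }
    }

    def main(args: Array[String]): Unit = {
      val threads = (1 to 5).map(i => new Thread(() => submit(i)))
      threads.foreach(_.start())
      threads.foreach(_.join())
    }
  }

Running this typically prints several "accepted" lines even though
maxCreation is 1, which matches what I see with the curl loop above.
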
So, when I submit the jobs at almost the same moment (a POST request takes only
about tens of milliseconds), I can spawn more spark-submit processes than the
configured maximum allows.
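
If this is confirmed, one possible direction for a fix could be to make the
check and the reservation a single atomic step, for example along these lines
(just a sketch of the idea with made-up names, not a patch against the real
code):

  // Hypothetical sketch: guard the check and the counter update with one
  // lock, so no request can slip in between another request's check and
  // its increment. The slot is reserved before spark-submit is spawned
  // and released again if the creation fails.
  object CreationLimiter {
    private val lock = new Object()
    private var activeCreations = 0
    private val maxCreation = 1       // livy.server.session.max-creation

    // Returns true if a creation slot was reserved, false if rejected.
    def tryReserveSlot(): Boolean = lock.synchronized {
      if (activeCreations >= maxCreation) false
      else { activeCreations += 1; true }
    }

    // Free the slot once the session is started (or its creation failed).
    def releaseSlot(): Unit = lock.synchronized {
      activeCreations -= 1
    }
  }

Reserving before creating would keep the expensive spark-submit call outside
the lock, so the lock is only held for the counter update.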

If this is a bug, I will create a Jira and try to fix it.


