Hello,

I would like to expose Apache Spark to untrusted users (through Livy, and with
a direct JDBC connection).

However, there appear to be a variety of avenues by which one of these
untrusted users can execute arbitrary code (by design): PySpark, SparkR, jar
uploads, various UDFs, etc.

I would like to prevent my untrusted users from executing arbitrary remote
code. I have found small bits of information relating to this[0][1], but
nothing comprehensive or prescriptive.

I understand that this is not exactly Spark's intended use case, but any
thoughts or opinions with regard to it would be appreciated, especially if
there is an established process for handling this scenario.
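For context, the closest thing to a mitigation I have found is the
HiveServer2 built-in UDF whitelist from [1], which as I understand it only
restricts the JDBC/Thrift path (it does nothing for PySpark, SparkR, or jar
uploads), and I am not certain the Spark Thrift Server honors it. A minimal
hive-site.xml sketch, with an illustrative (not recommended) function list:

```xml
<!-- hive-site.xml: restrict queries arriving over the JDBC/Thrift path to a
     fixed set of built-in UDFs. The function names below are only an
     example; the real list would need to match the workload. -->
<property>
  <name>hive.server2.builtin.udf.whitelist</name>
  <value>concat,substr,sum,count,avg,min,max</value>
</property>
```

Even if this works as advertised, it still leaves all of the other code
paths above open, which is why I am asking about a comprehensive approach.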

Thanks.

Jack

0: https://stackoverflow.com/questions/38333873/securely-running-a-spark-application-inside-a-sandbox
1: https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.server2.builtin.udf.whitelist

