Hi devs,

I'm working (at Stratio) on using Spark over Mesos and standalone, with a
kerberized HDFS.

We are working to solve these scenarios:


   - We have a long-running Spark SQL context, used concurrently by many
   users (like the Thrift server), called CrossData. We need to access HDFS
   data with Kerberos authorization using the proxy-user method; we rely on
   the HDFS permission system, or our custom authorizer.


   - We need to load/write DataFrames using data sources with an HDFS
   backend (built-in or others) such as JSON, CSV, Parquet, ORC…, so we want
   to enable secure access (krb) by configuration only.


   - We have a scenario where we want to run streaming jobs over kerberized
   HDFS, both for reads/writes and for checkpointing.


   - We have to load every single RDD that Spark Core creates over
   kerberized HDFS, without breaking the Spark API.
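
For the "secure access by configuration" scenario, what we have in mind is
roughly the following sketch (the principal, keytab path, and paths are
placeholders, not our real configuration; it assumes the driver/executors can
reach the KDC):

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.UserGroupInformation

val hadoopConf = new Configuration()
hadoopConf.set("hadoop.security.authentication", "kerberos")
UserGroupInformation.setConfiguration(hadoopConf)

// Log in once from a keytab; after this, the built-in data sources
// (json, csv, parquet, orc, ...) can read/write the secured HDFS.
UserGroupInformation.loginUserFromKeytab(
  "crossdata@EXAMPLE.COM",           // placeholder principal
  "/etc/security/keytabs/cd.keytab") // placeholder keytab path

// Any context created afterwards inherits the Kerberos credentials, e.g.:
// sqlContext.read.parquet("hdfs://secure-nn:8020/data/events")
```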




As you can see, we have a "special" requirement: we need to set the proxy
user per job over the same Spark context.
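
A sketch of what we are considering for that, assuming the service principal
is allowed to impersonate end users via the hadoop.proxyuser.* settings in
core-site.xml (runAsUser and the "alice" user are hypothetical names):

```scala
import java.security.PrivilegedExceptionAction
import org.apache.hadoop.security.UserGroupInformation

// The service logs in once as a superuser allowed to impersonate others.
val superUgi = UserGroupInformation.getLoginUser

// Per job: wrap the work in doAs with a proxy UGI for the end user.
def runAsUser[T](user: String)(job: => T): T = {
  val proxyUgi = UserGroupInformation.createProxyUser(user, superUgi)
  proxyUgi.doAs(new PrivilegedExceptionAction[T] {
    override def run(): T = job
  })
}

// Hypothetical usage over the shared context; HDFS then enforces
// permissions as "alice", not as the service principal:
// runAsUser("alice") { sqlContext.sql("SELECT ...").collect() }
```

The open question is whether this is safe when many such jobs run
concurrently over the same long-lived context.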

Do you have any ideas on how to cover this?
