[ https://issues.apache.org/jira/browse/SPARK-6017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338787#comment-14338787 ]
Thomas Graves commented on SPARK-6017:
--------------------------------------
spark.authenticate.secret is only used in standalone mode, not on YARN. YARN
distributes it via UGI. See SecurityManager.scala line 347.
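
For illustration only, a standalone-mode setup might carry the shared secret in
spark-defaults.conf roughly as sketched below. The secret value is a
placeholder, not a recommendation. On YARN only spark.authenticate would be
set, since the secret itself is generated and distributed through the UGI
credentials rather than read from configuration.

```properties
# Standalone mode (sketch): every node needs the same secret configured.
spark.authenticate          true
# Placeholder value; choose your own secret.
spark.authenticate.secret   change-me

# On YARN, spark.authenticate.secret is not used; setting
# spark.authenticate to true is enough.
```
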
Yes, some of those Hadoop interfaces aren't public, and I believe I filed bugs
against those (https://issues.apache.org/jira/browse/HADOOP-10506), although I
should probably pull out UGI specifically. There isn't any other way to
properly make it work with a secure cluster without using those, so I consider
that a bug on the Hadoop side. You will notice that UGI is annotated with
@InterfaceAudience.LimitedPrivate({"HDFS", "MapReduce", "HBase", "Hive",
"Oozie"}). We could add Spark to that list, but I thought it wasn't worth the
effort: if they do change something there, it will break the world.
I would personally prefer sticking with one way of distribution, that being the
UGI. I don't see a reason to reinvent that unless, of course, it's going to work
across all cluster configs (standalone, YARN, Mesos, etc.). Then it might make
sense.
> Provide transparent secure communication channel on Yarn
> --------------------------------------------------------
>
> Key: SPARK-6017
> URL: https://issues.apache.org/jira/browse/SPARK-6017
> Project: Spark
> Issue Type: Umbrella
> Components: YARN
> Reporter: Marcelo Vanzin
> Attachments: secure_spark_on_yarn.pdf
>
>
> A quick description:
> Currently driver and executors communicate through an insecure channel, so
> anyone can listen on the network and see what's going on. That prevents Spark
> from adding some features securely (e.g. SPARK-5342, SPARK-5682) without
> resorting to using internal Hadoop APIs.
> Spark 1.3.0 will add SSL support, but properly configuring SSL is not a
> trivial task for operators, let alone users.
> In light of those, we should add a more transparent secure transport layer.
> I've written a short spec to identify the areas in Spark that need work to
> achieve this, and I'll attach the document to this issue shortly.
> Note I'm restricting things to Yarn currently, because as far as I know it's
> the only cluster manager that provides the needed security features to
> bootstrap the secure Spark transport. The design itself doesn't really rely
> on Yarn per se, just on a secure way to distribute the initial secret (which
> the Yarn/HDFS combo provides).