The issue is twofold. First, how do you authenticate to nimbus without
exposing the UI's credentials? (That is the security issue that made me pull it
out.) Second, does the topology get access to the credentials it needs
to run properly? This I think is the harder one to solve, and instead of being
an obvious security hole it will force users to open up security holes in their
own code to make this work.
Your proposal only isolates the local user; it does nothing about getting the
needed credentials. We can isolate the local user already by launching the
submitted jar through the run-as-user functions that the supervisor already
uses. That is not the difficult part. If we really are serious about making
this work we will need to add in support for something similar to delegation
tokens when talking to nimbus and other storm services like DRPC. The UI user
would then be able to ask nimbus to give it some short-term credentials that
could then be given to the submission code. That code, running as the user
that submitted it, would then use those credentials in place of the TGT/service
ticket to authenticate with nimbus. That would fix the first concern.
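To make the pattern concrete, here is a rough sketch of what a short-term delegation token could look like. This is not Storm's actual API; the function names, the HMAC scheme, and the master secret are all hypothetical, just to illustrate how nimbus could mint credentials that the submission code presents in place of a TGT:

```python
import base64
import hashlib
import hmac
import json
import time

MASTER_SECRET = b"nimbus-master-secret"  # hypothetical: known only to nimbus

def issue_delegation_token(user, lifetime_secs=300):
    """Nimbus side (sketch): mint a short-lived signed token for `user`."""
    payload = json.dumps({"user": user,
                          "expires": time.time() + lifetime_secs}).encode()
    sig = hmac.new(MASTER_SECRET, payload, hashlib.sha256).hexdigest()
    return base64.b64encode(payload).decode() + "." + sig

def verify_delegation_token(token):
    """Nimbus side (sketch): check signature and expiry.

    Returns the user name if the token is valid, None otherwise.
    """
    payload_b64, sig = token.rsplit(".", 1)
    payload = base64.b64decode(payload_b64)
    expected = hmac.new(MASTER_SECRET, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # token was tampered with or signed by someone else
    claims = json.loads(payload)
    if time.time() > claims["expires"]:
        return None  # token expired; the UI user must request a new one
    return claims["user"]
```

The point is just that the credential is short-lived and verifiable by nimbus alone, so handing it to the submission code never exposes the UI's own Kerberos credentials.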
On the nimbus side we already support delegation tokens for most of the Hadoop
ecosystem that uses them, so a topology that talks with HDFS/HBase would work
just fine this way. We would have to add in support to DRPC as well so the
topology could authenticate with the DRPC server, but the biggest issue I see
is that kafka does not support delegation tokens in any way at all right now.
The closest it comes to something like that is mutual authentication through
SSL. To make this work we would either need to add delegation tokens to
kafka (which I think is in the works), somehow configure nimbus to be able to
talk to a CA and generate/fetch certificates that clients could use to
communicate with kafka, or have users include those certs with their topology.
The SSL CA stuff is just off the top of my head and I honestly don't know how
feasible it really would be.
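For reference, kafka's mutual authentication through SSL is driven entirely by client configuration, along these lines (these are standard kafka client property names; the paths and passwords below are placeholders):

```properties
security.protocol=SSL
ssl.truststore.location=/path/to/truststore.jks
ssl.truststore.password=changeit
# The keystore holds the client certificate used for mutual authentication
ssl.keystore.location=/path/to/client.keystore.jks
ssl.keystore.password=changeit
ssl.key.password=changeit
```

Which is why the certificate distribution problem matters: whoever runs the topology has to get a keystore onto the workers somehow, whether nimbus fetches it from a CA or the user ships it with the topology.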
If we want to start down the delegation token route I would be fine with it
even without kafka. - Bobby
On Sunday, September 18, 2016 7:34 PM, Jungtaek Lim <kabh...@gmail.com> wrote:
I've had this idea in mind for a long time, but I wasn't sure it would work
without security issues. I've decided to share it since uploading a topology
via REST API is still a great feature to have.
Assuming that no one uses the Storm 0.10.0 beta: uploading a topology via REST
API was removed for security reasons.
If my understanding is right, the main security issue is that the client class
(which configures and submits the topology) would be executed on the UI
server under the same account.
What if we just submit a pre-defined topology which in turn submits the actual
topology from a given topology jar?
It would be similar to the cluster mode of the Spark driver, but the
pre-defined topology would shut down immediately after submitting the actual
topology.
Arbitrary code could still be run on a worker node, but it would be running on
one of the workers, where it can be constrained by the existing security
features. (I might be wrong, since I may not have a clear understanding of the
security features, especially around submitting a new topology from a worker.)
We don't need to worry about non-secured clusters, since they are just
non-secured. Even without this API, anyone can include arbitrary code in a
Spout or Bolt and submit that topology.
Can this idea address the security issue with the upload topology REST API?
Thanks in advance,
Jungtaek Lim (HeartSaVioR)