I believe that really protecting the cluster from unauthorized use requires the cluster endpoints (notably Akka) to perform an authorization check. The 'secure flink' design doc outlines various measures to achieve that.
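To illustrate why the check has to live on the endpoint rather than in the client, here is a minimal Java sketch. Everything in it is hypothetical: `SubmissionEndpoint`, `submit`, and the principal names are made up for illustration and are not Flink or Akka APIs. It also assumes the endpoint has already verified the caller's identity itself (e.g. through Kerberos) and only receives the resulting principal.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not Flink code: a server-side endpoint that decides
// on its own whether a submission is allowed. A modified client cannot skip
// this check, because the check does not run in the client.
final class SubmissionEndpoint {
    private final Set<String> authorizedPrincipals;

    SubmissionEndpoint(Set<String> authorizedPrincipals) {
        // Defensive copy so later changes by the caller have no effect.
        this.authorizedPrincipals = new HashSet<>(authorizedPrincipals);
    }

    /**
     * Returns true if the job is accepted, false if it is rejected.
     * verifiedPrincipal is assumed to be the identity the server
     * authenticated, or null if the caller did not authenticate.
     */
    boolean submit(String verifiedPrincipal, String jobName) {
        if (verifiedPrincipal == null) {
            return false; // unauthenticated caller
        }
        if (!authorizedPrincipals.contains(verifiedPrincipal)) {
            return false; // authenticated but not authorized
        }
        // A real endpoint would hand the job to the scheduler here.
        return true;
    }
}
```

With such a check in place, an unauthenticated WordCount submission would be rejected before it consumes any cluster resources, regardless of which client build sent it.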
Stefano, I'll reach out to set up a sync-up meeting and to incorporate your feedback. Thanks!

> On May 17, 2016, at 8:36 AM, Robert Metzger <rmetz...@apache.org> wrote:
>
> I'm not sure if doing the check in the CliFrontend is really effective. A
> "hacker" could just create a custom Flink build without that check and
> still submit a job to the job manager.
>
> On Thu, May 5, 2016 at 2:51 PM, Stefano Baghino <
> stefano.bagh...@radicalbit.io> wrote:
>
>> Apologies for being too generic: by "secure" cluster I mean a Flink
>> cluster that has been launched with Kerberos credentials (either on YARN
>> or with the standalone scheduler), thus having access to resources on the
>> cluster that require authentication (like HDFS).
>>
>> Without having to run jobs on behalf of an authenticated user (which is
>> another kind of problem), the facilities to perform a check that the
>> submitter is authenticated are already in place
>> (CliFrontend::parseParameters, the branch of the switch-case statement that
>> handles the "run" command) and requiring a submission to come from an
>> authenticated user should come almost for free.
>>
>> On Thu, May 5, 2016 at 1:18 PM, Robert Metzger <rmetz...@apache.org>
>> wrote:
>>
>>> Hi Stefano,
>>>
>>> what exactly do you mean by a secure cluster?
>>> A Flink on YARN session in a secured YARN cluster?
>>> A standalone Flink cluster with access to a secured HDFS?
>>>
>>> Your observation is right. We are not checking whether a job submitted
>>> by any user is running in the same security context as the Flink cluster.
>>>
>>> On Thu, May 5, 2016 at 11:57 AM, Stefano Baghino <
>>> stefano.bagh...@radicalbit.io> wrote:
>>>
>>>> Hello everybody,
>>>>
>>>> last week I ran some tests on a secure cluster and I noticed that an
>>>> unauthenticated user can submit a Flink job that will only eventually
>>>> fail if the job tries to access secured resources (e.g. HDFS). This
>>>> doesn't, however, prevent the user from consuming resources of the
>>>> secure cluster without authentication (I tried it with the WordCount
>>>> example).
>>>>
>>>> I'd say this is a bug; is there a reason for this? If you share my
>>>> feeling on this, I pinpointed the code that's responsible for this and
>>>> the fix seems trivial, so I can open an issue and a PR today. Thanks!
>>>>
>>>> --
>>>> BR,
>>>> Stefano Baghino
>>>>
>>>> Software Engineer @ Radicalbit
>>
>> --
>> BR,
>> Stefano Baghino
>>
>> Software Engineer @ Radicalbit