Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/20891
@jerryshao @ajbozarth, @vanzin also told me the same in the JIRA. Honestly,
I think it was a mistake to reject this in the past.
I am not aware of any other system that allows users without read
permissions to list what other users are doing. You can check the behavior
of any DB, for instance (Postgres, Oracle, ...).
As it stands, we show all users which other users are on the system, when
they are running applications, and so on. Some examples of information
that users can learn this way but shouldn't:
- the names of the other users on the system (if I am a company and I have
two consulting companies working on the same cluster, I might not want
each company to know that the other is working there too);
- if the application names are descriptive, I can understand what another
user is doing on the cluster, even though I do not have read permissions
for their applications; again, I might learn that a competitor company is
working on that cluster on a specific job;
- non-admin users can see how many and which users are currently using the
cluster.
None of these things should be disclosed to non-admin users. I think this
is especially critical in situations where a company owns a cluster but
several consulting companies are working on it. The owner of the cluster is
likely unwilling to disclose to its consultants which other consultants
are there and what they are doing. As things stand, we are letting them
know.
Moreover, it is semantically wrong. We say that a user has no read
permission for an application, but in fact they can see that the
application exists, its name, duration, submitting user, and so on. So
users can see some details of something they have no read access to.
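
As a minimal sketch of the behavior argued for above (this is not Spark's
actual code; `AppInfo`, `can_view`, and `list_visible_apps` are hypothetical
names, and the ACL model of owner, view ACLs, and admins is an assumption),
the application listing would be filtered by the same check that guards
read access:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AppInfo:
    """Hypothetical summary of one application shown in a listing."""
    app_id: str
    name: str
    owner: str
    # Users explicitly granted read access (assumed ACL model).
    view_acls: frozenset = field(default_factory=frozenset)


def can_view(user, app, admins):
    """True if `user` may read `app`: admin, owner, or in its view ACLs."""
    return user in admins or user == app.owner or user in app.view_acls


def list_visible_apps(user, apps, admins=frozenset()):
    """Return only the applications `user` has read access to."""
    return [app for app in apps if can_view(user, app, admins)]
```

With this filter, a user without read permission on an application learns
nothing about it from the listing, not even that it exists.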