Hi fellow devs,

As I am sure most of you are aware, AsterixDB today basically doesn't have auth. You can enable mutual TLS, and there's basic auth for Python UDFs, but that's it. I don't think this is because it was off the radar; rather, it was rolled into integrating with some wider framework in a deployment ecosystem (YARN, etc.) that would handle parts of this for us, or that would have particulars that should be focused on first.

However, I think the time has come to consider some of these things outside of that aspect. As it stands, I think this is a glaring impediment to the adoption of AsterixDB; many times I have had to give the rather unsatisfying answer of "Put nginx in front of it" or something to that effect. There is also the related but separate issue of API keys for external datasets currently being part of DDLs and Metadata, which is readable by everyone.

What I think has changed is that many of the building blocks for a robust implementation now exist in the code (e.g. Janhavi's work on entity owners). All that needs to be added are a few missing pieces, and I think this problem can be tackled in a sensible way without reinventing a lot of things. The idea I have in my head at this point is to do something like this as a set of first steps:

1. As a simple first step, allow the use of BasicAuthServlet for all APIs instead of just libraries. Invoke the asterixhelper utility with start-sample-cluster.sh to write the passwd file on first start.
2. Allow tying the "Owner" field in entities to entries in the passwd file through the HTTP API.
3. Add a "Secrets" Metadata Dataset that initially can only be modified or directly viewed by the system user, and that will store things like API keys that currently have to be added as part of a DDL.
4. Make it so that, by default, only the system user can create or destroy entities.
5. Add a "Users" Metadata Dataset. This will basically replace the passwd file in a more robust way. Users can support the same mode (i.e. a simple username with a salted hash). A default username and password can be generated on metadata bootstrap and written to a file.
6. Let Users be identified through OpenID Connect, and define a basic set of permissions and roles that can be assigned to users. Handle initial auth similarly to Gerrit (the first login is admin). Allow cc.conf to define which providers should be allowed.

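To make the "simple username, with a salted hash" idea in step 5 a bit more concrete, here is a minimal sketch of how such an entry could be produced and checked using only the JDK. None of this is existing AsterixDB code: the class name, the `iterations:salt:hash` record layout, the choice of PBKDF2, and the iteration count are all assumptions made purely for illustration.

```java
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Hypothetical helper, not part of AsterixDB: shows one way a "Users" entry
// could store a salted password hash instead of the password itself.
public class SaltedPasswordSketch {
    private static final int ITERATIONS = 210_000; // assumed work factor, tune as needed
    private static final int KEY_BITS = 256;       // length of the derived hash in bits
    private static final SecureRandom RANDOM = new SecureRandom();

    // Returns "iterations:base64(salt):base64(hash)" for storage next to the username.
    public static String hash(char[] password) throws Exception {
        byte[] salt = new byte[16];
        RANDOM.nextBytes(salt); // fresh random salt per user
        byte[] derived = derive(password, salt, ITERATIONS);
        Base64.Encoder enc = Base64.getEncoder();
        return ITERATIONS + ":" + enc.encodeToString(salt) + ":" + enc.encodeToString(derived);
    }

    // Re-derives the hash from the stored salt and compares in constant time.
    public static boolean verify(char[] password, String record) throws Exception {
        String[] parts = record.split(":");
        if (parts.length != 3) {
            return false;
        }
        int iterations = Integer.parseInt(parts[0]);
        byte[] salt = Base64.getDecoder().decode(parts[1]);
        byte[] expected = Base64.getDecoder().decode(parts[2]);
        byte[] actual = derive(password, salt, iterations);
        return MessageDigest.isEqual(expected, actual);
    }

    private static byte[] derive(char[] password, byte[] salt, int iterations) throws Exception {
        PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, KEY_BITS);
        try {
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec)
                    .getEncoded();
        } finally {
            spec.clearPassword(); // drop the internal copy of the password
        }
    }

    public static void main(String[] args) throws Exception {
        String record = hash("example-password".toCharArray());
        System.out.println(verify("example-password".toCharArray(), record));
        System.out.println(verify("wrong-password".toCharArray(), record));
    }
}
```

Storing the iteration count in the record would let the work factor be raised later without invalidating existing entries, which is one reason I'd lean towards a self-describing format like this over a bare hash.
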
Interested to hear anyone's thoughts or suggestions. Certainly this isn't a complete or exhaustive list of what should eventually be done, so additions are quite welcome.

Thanks,
-Ian
