ianmcook commented on issue #3765:
URL: https://github.com/apache/arrow-adbc/issues/3765#issuecomment-3635178411
Hi @davidhcoe, thanks for the nudge. I had a very long chat with @lidavidm
and @zeroshade about this today. Here’s what we’re thinking:
- It seems unambiguously necessary for ADBC to provide a standard way of
getting the connection configs out of user/application code and into a separate
file.
- We expect that there will be strong agreement from the Arrow/ADBC
developer and user communities on this.
- After much thinking and discussion, I propose that we use the name
_**"connection profiles"**_ to refer to this.
- The successful recent effort to add a [driver manifests
spec](https://arrow.apache.org/adbc/current/format/driver_manifests.html#driver-manifests)
and associated implementations to ADBC can serve as a guide for how to add a
connection profiles spec and associated implementations.
- There’s nice symmetry here between ADBC and ODBC. We added driver
manifests (which are analogous to `odbcinst.ini`) and now we’re adding
connection profiles (which are analogous to the ODBC DSNs stored in `odbc.ini`).
- Just like with ODBC DSNs, each ADBC connection profile will be
required to specify which driver it’s for.
- We will draft a doc with technical specifics, and circulate it on the
Arrow developer mailing list, but here’s what we’re thinking:
- We should establish specified locations where connection profiles are
stored:
- `%LOCALAPPDATA%\ADBC\Profiles` on Windows
- `~/Library/Application Support/ADBC/Profiles` on macOS
- `$XDG_CONFIG_HOME/adbc/profiles` on Linux
- We should define an environment variable `ADBC_PROFILE_PATH` that
overrides the above and takes precedence.
- We should define different search behaviors when running inside venv
and Conda environments for some of the driver managers (to allow virtual
environment/per-project isolation).
- Unlike with the driver manifests, I don’t think we need a system-level
config directory for connection profiles.
- Unlike with the driver manifests, I don’t think we need to use the
Windows Registry to store connection profiles on Windows.
- Inside the connection profiles directory, each connection profile is a
separate TOML file.
- TOML seems like a good choice for this, and we already added a
TOML parser to the driver managers for driver manifests.
- Note the difference from `odbc.ini` here. With `odbc.ini`, all the
DSNs are defined in a single file. With ADBC connection profiles, each
connection profile is defined in its own separate TOML file.
- The name of each `.toml` file is the name of the connection
profile.
- We can define rules about the allowed character set, case
insensitivity, etc. for these filenames.
- Within the connection profiles directory, you can add subdirectories
to create a hierarchical structure in which to organize the connection profile
TOML files.
- This will be convenient for defining separate `dev`, `test`,
`prod` connection profile sets, etc.
- Each connection profile TOML file will be pretty simple:
- It will refer to the driver that it’s compatible with.
- Using the current ADBC driver name search behavior.
- It will contain key-value pairs matching the names and values of
the connection arguments that the drivers expect.
- The key names will not be standardized by the spec; they will
be defined by the requirements of the individual drivers.
- There will also be a connection profile version number and some
other optional metadata in the TOML file.
- There will be a function `env_var` that can be used in the
connection profile to look up values in environment variables.
- This will be one of the recommended ways to pass secrets.
- Later we plan to add other ways to securely store secrets and
refer to them in the connection profile TOML.
- For example, through secrets management tools such as HashiCorp
Vault.
- The ADBC driver managers will all be changed to accept a new argument
named `profile` that can be used to pass a connection profile name.
- If the user sets `profile` to `foo`, then the driver manager will
search the directories described above for `foo.toml`.
- If the user sets `profile` to `path/to/foo`, then the driver
manager will search for a file named `foo.toml` in the relative path `path/to`
in the directories described above.
- If the user sets `profile` to a full absolute path, then the
driver manager will use the connection profile file at that path.
- The ADBC driver managers will also provide a way to specify which
connection profile to use through a URI.
- This is a part of our recent effort to ensure that all the ADBC
drivers can accept their configs through a single URI.
- If the URI begins with `profile://`, then the driver manager will
treat the rest of the URI as a connection profile name or path and search for
it as described above.
- `profile` will therefore become a reserved word in ADBC; no
driver will ever be allowed to be named `profile`.
- If the user/application code specifies additional connection arguments
(as individual parameters passed to the driver manager, or in the URI passed to
the driver manager), then those arguments will override the arguments specified
in the connection profile.
- This allows for one-off overrides of arguments specified in
connection profiles, and filling in of arguments that are missing in connection
profiles.
- This also provides another way to pass secrets that shouldn’t be
hard-coded as literal strings in the connection profile files.
- As a part of this proposal, we also want to specify an abstract interface
for connection profiles that can be used in the future to add other
non-file-based connection profile providers.
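
To make the lookup and override rules above a bit more concrete, here is a rough Python sketch. It is only an illustration of the proposal, not how any driver manager implements it. All function names are hypothetical. It also assumes several things the proposal does not settle: that `ADBC_PROFILE_PATH` is an `os.pathsep`-separated list that replaces the default location when set, that `$XDG_CONFIG_HOME` falls back to `~/.config` (the XDG default), that `env_var` is written as `env_var(NAME)` inside a string value, and that the profile's key-value pairs have already been parsed from TOML into a dict.

```python
import os
import re
import sys
from pathlib import Path

PROFILE_URI_PREFIX = "profile://"
# Hypothetical syntax: the proposal names an `env_var` function
# but does not specify how it is written in the TOML file.
ENV_VAR_CALL = re.compile(r"^env_var\((\w+)\)$")


def search_dirs(env=os.environ):
    """Return the directories to search for connection profiles."""
    override = env.get("ADBC_PROFILE_PATH")
    if override:
        # Assumption: a PATH-style list that replaces the defaults.
        return [Path(p) for p in override.split(os.pathsep) if p]
    if sys.platform == "win32":
        # Windows exposes the per-user local app data directory as LOCALAPPDATA.
        return [Path(env["LOCALAPPDATA"]) / "ADBC" / "Profiles"]
    if sys.platform == "darwin":
        return [Path.home() / "Library" / "Application Support" / "ADBC" / "Profiles"]
    # Assumption: fall back to the XDG default when XDG_CONFIG_HOME is unset.
    config_home = env.get("XDG_CONFIG_HOME") or str(Path.home() / ".config")
    return [Path(config_home) / "adbc" / "profiles"]


def resolve_profile(profile, env=os.environ):
    """Map a name, relative path, absolute path, or profile:// URI to a file."""
    if profile.startswith(PROFILE_URI_PREFIX):
        profile = profile[len(PROFILE_URI_PREFIX):]
    path = Path(profile)
    if path.is_absolute():
        # A full absolute path is used as given.
        return path
    for directory in search_dirs(env):
        # "foo" -> <dir>/foo.toml, "path/to/foo" -> <dir>/path/to/foo.toml
        candidate = directory / (profile + ".toml")
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"no connection profile named {profile!r}")


def expand_env_vars(options, env=os.environ):
    """Replace string values of the form env_var(NAME) with env[NAME]."""
    expanded = {}
    for key, value in options.items():
        match = ENV_VAR_CALL.match(value) if isinstance(value, str) else None
        expanded[key] = env[match.group(1)] if match else value
    return expanded


def effective_options(profile_options, overrides, env=os.environ):
    """Arguments passed by the caller win over the ones in the profile."""
    return {**expand_env_vars(profile_options, env), **overrides}
```

For example, with made-up keys `uri` and `password` (real key names would be defined by each driver), `effective_options({"uri": "grpc://dev.example:1234", "password": "env_var(DB_PASSWORD)"}, {"uri": "grpc://localhost:1234"}, env={"DB_PASSWORD": "s3cret"})` returns `{"uri": "grpc://localhost:1234", "password": "s3cret"}`: the caller's `uri` overrides the profile's, and the password comes from the environment rather than from the file.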
How does that sound?