ianmcook commented on issue #3765:
URL: https://github.com/apache/arrow-adbc/issues/3765#issuecomment-3635178411

   Hi @davidhcoe, thanks for the nudge. I had a very long chat with @lidavidm 
and @zeroshade about this today. Here’s what we’re thinking:
   
   - It seems unambiguously necessary for ADBC to provide a standard way of 
getting the connection configs out of user/application code and into a separate 
file.
     - We expect that there will be strong agreement from the Arrow/ADBC 
developer and user communities on this.
   - After much thinking and discussion, I propose that we use the name 
_**"connection profiles"**_ to refer to this.
   - The successful recent effort to add a [driver manifests 
spec](https://arrow.apache.org/adbc/current/format/driver_manifests.html#driver-manifests)
 and associated implementations to ADBC can serve as a guide for how to add a 
connection profiles spec and associated implementations.
   - There’s nice symmetry here between ADBC and ODBC. We added driver 
manifests (which are analogous to `odbcinst.ini`) and now we’re adding 
connection profiles (which are analogous to the ODBC DSNs stored in `odbc.ini`).
       - Just like with ODBC DSNs, each ADBC connection profile will be 
required to specify which driver it’s for.
   - We will draft a doc with technical specifics, and circulate it on the 
Arrow developer mailing list, but here’s what we’re thinking:
       - We should establish specified locations where connection profiles are 
stored:
           - `%LOCALAPPDATA%\ADBC\Profiles` on Windows
           - `~/Library/Application Support/ADBC/Profiles` on macOS
           - `$XDG_CONFIG_HOME/adbc/profiles` on Linux
       - We should define an environment variable `ADBC_PROFILE_PATH` that 
overrides the above and takes precedence.
       - We should define different search behaviors when running inside venv 
and Conda environments for some of the driver managers (to allow virtual 
environment/per-project isolation).
       - Unlike with the driver manifests, I don’t think we need a system-level 
config directory for connection profiles.
       - Unlike with the driver manifests, I don’t think we need to use the 
Windows Registry to store connection profiles on Windows.
       - Inside the connection profiles directory, each connection profile is a 
separate TOML file.
           - TOML seems like a good choice for this, and we already added a 
TOML parser to the driver managers for driver manifests.
           - Note the difference from `odbc.ini` here. With `odbc.ini`, all the 
DSNs are defined in a single file. With ADBC connection profiles, each 
connection profile is defined in its own separate TOML file.
           - The name of each `.toml` file is the name of the connection 
profile.
               - We can define rules about the allowed character set, case 
insensitivity, etc. for these filenames.
       - Within the connection profiles directory, you can add subdirectories 
to create a hierarchical structure in which to organize the connection profile 
TOML files.
           - This will be convenient for defining separate `dev`, `test`, 
`prod` connection profile sets, etc.
       - Each connection profile TOML file will be pretty simple:
           - It will refer to the driver that it’s compatible with.
               - Using the current ADBC driver name search behavior.
           - It will contain key-value pairs matching the names and values of 
the connection arguments that the driver expects.
               - The key names will not be standardized by the spec; they will 
be defined by the requirements of the individual drivers.
           - There will also be a connection profile version number and some 
other optional metadata in the TOML file.
           - There will be a function `env_var` that can be used in the 
connection profile to look up values in environment variables.
               - This will be one of the recommended ways to pass secrets.
           - Later we plan to add other ways to securely store secrets and 
refer to them in the connection profile TOML.
               - For example, through secrets management tools such as 
HashiCorp Vault.
       - The ADBC driver managers will all be changed to accept a new argument 
named `profile` that can be used to pass a connection profile name.
           - If the user sets `profile` to `foo`, then the driver manager will 
search the directories described above for `foo.toml`.
           - If the user sets `profile` to `path/to/foo`, then the driver 
manager will search for a file named `foo.toml` in the relative path `path/to` 
in the directories described above.
           - If the user sets `profile` to a full absolute path, then the 
driver manager will use the connection profile file at that path.
       - The ADBC driver managers will also provide a way to specify which 
connection profile to use through a URI.
           - This is a part of our recent effort to ensure that all the ADBC 
drivers can accept their configs through a single URI.
           - If the URI begins with `profile://`, then the driver manager will 
search the subsequently specified path for a connection profile as described 
above.
               - `profile` will therefore become a reserved word in ADBC; no 
driver will ever be allowed to be named `profile`. 
       - If the user/application code specifies additional connection arguments 
(as individual parameters passed to the driver manager, or in the URI passed to 
the driver manager), then those arguments will override the arguments specified 
in the connection profile.
           - This allows for one-off overrides of arguments specified in 
connection profiles, and filling in of arguments that are missing in connection 
profiles.
           - This also provides another way to pass secrets that shouldn’t be 
hard-coded as literal strings in the connection profile files.
   - As a part of this proposal, we also want to specify an abstract interface 
for connection profiles that can be used in the future to add other 
non-file-based connection profile providers.
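
To make the lookup rules above concrete, here is a rough Python sketch of how a driver manager might turn a `profile` value into a file path. It is only an illustration of the proposal, not an existing ADBC API: the function names `profile_search_dirs` and `resolve_profile` are made up, the `~/.config` fallback when `XDG_CONFIG_HOME` is unset comes from the XDG Base Directory convention rather than from this proposal, and treating `ADBC_PROFILE_PATH` as an `os.pathsep`-separated list of directories is an assumption. The venv/Conda search behavior is left out.

```python
import os
import sys
from pathlib import Path


def profile_search_dirs(env=None, platform=sys.platform, home=None):
    """Return the directories to search for connection profiles (hypothetical helper)."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # ADBC_PROFILE_PATH overrides the per-OS default location.
    # Assumption: it may hold several directories separated by os.pathsep.
    override = env.get("ADBC_PROFILE_PATH")
    if override:
        return [Path(p) for p in override.split(os.pathsep) if p]

    if platform.startswith("win"):
        # The Windows variable for the local app data folder is LOCALAPPDATA.
        return [Path(env["LOCALAPPDATA"]) / "ADBC" / "Profiles"]
    if platform == "darwin":
        return [home / "Library" / "Application Support" / "ADBC" / "Profiles"]
    # Linux: fall back to ~/.config when XDG_CONFIG_HOME is unset (XDG convention).
    config_home = env.get("XDG_CONFIG_HOME") or str(home / ".config")
    return [Path(config_home) / "adbc" / "profiles"]


def resolve_profile(profile, dirs):
    """Map a profile name, relative path, absolute path, or profile:// URI to a file."""
    prefix = "profile://"
    if profile.startswith(prefix):
        profile = profile[len(prefix):]

    candidate = Path(profile)
    if candidate.is_absolute():
        # A full absolute path is used as given.
        return candidate if candidate.is_file() else None

    # "foo" -> <dir>/foo.toml, "path/to/foo" -> <dir>/path/to/foo.toml
    for directory in dirs:
        path = directory / (profile + ".toml")
        if path.is_file():
            return path
    return None
```

With this sketch, `resolve_profile("dev/foo", profile_search_dirs())` would look for `dev/foo.toml` under the per-user profiles directory, or under `ADBC_PROFILE_PATH` if that is set.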
   
   How does that sound?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
