[GitHub] [arrow-adbc] krlmlr commented on pull request #18: Add table/column reflection
krlmlr commented on PR #18: URL: https://github.com/apache/arrow-adbc/pull/18#issuecomment-1160714111 Yes, that looks about right -- and is still createable with *one single query* on DuckDB and perhaps Postgres. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow-adbc] krlmlr commented on pull request #18: Add table/column reflection
krlmlr commented on PR #18: URL: https://github.com/apache/arrow-adbc/pull/18#issuecomment-1160597922 This is similar to what MySQL does -- others (like Postgres, SQL Server, DuckDB (?)) would store the `fk_` columns in a detail table `constraint_column_usage` (we would use a list of structs here). I have found that approach to be slightly more natural for interpreting the relationships later on. The format you propose here is more like a "wide" form, the Postgres version would be a "longer" form. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow-adbc] krlmlr commented on pull request #18: Add table/column reflection
krlmlr commented on PR #18: URL: https://github.com/apache/arrow-adbc/pull/18#issuecomment-1159752806 Great! I think I can come up with a SQL query that returns this result for Postgres right from the database, but this is useful only if we support getting nested data from the database. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow-adbc] krlmlr commented on pull request #18: Add table/column reflection
krlmlr commented on PR #18: URL: https://github.com/apache/arrow-adbc/pull/18#issuecomment-1159732527 I wonder if we also should offer a way to expose primary and foreign key, and perhaps unique constraints. This allows learning the data model from a remote database. I have devised a "minimum variant" of a sane `INFORMATION_SCHEMA` that allows storing this information. Of course no database (except perhaps DuckDB) implements this schema, tweaks are needed everywhere else. Essentially this consists of three new views, linked 1:n: `tables` <- `table_constraints` <- `key_column_usage` <-`constraint_column_usage` . This would only require an extra column in our "tables" element, implementers could leave it empty. What do you think? I'd be excited to see it here, happy to provide guidance if we agree to move forward. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow-adbc] krlmlr commented on pull request #18: Add table/column reflection
krlmlr commented on PR #18: URL: https://github.com/apache/arrow-adbc/pull/18#issuecomment-1158246255 Fair enough, we can go with a static output schema for type stability. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow-adbc] krlmlr commented on pull request #18: Add table/column reflection
krlmlr commented on PR #18: URL: https://github.com/apache/arrow-adbc/pull/18#issuecomment-1158037665 I'm not sure how difficult unpacking will be, and perhaps if filtering by catalog name we would only get ``` schema_name: utf8 schema_tables: list<...> ``` because the catalog name will be fixed. TBD. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow-adbc] krlmlr commented on pull request #18: Add table/column reflection
krlmlr commented on PR #18: URL: https://github.com/apache/arrow-adbc/pull/18#issuecomment-1158022230 Thanks. I wonder if we need that many entry points. If we assume a hierarchy like catalog -> schema -> table -> column, could we have one single entry point that returns a nested data structure of (potentially) all catalogs with all schemas, tables, and columns (all nested)? We would offer two ways of filtering: - filter by depth: return details on the catalog, schema, table, or column level - filter by name: catalog name, then perhaps schema name, then perhaps table name, then perhaps column name Many databases with `information_schema` could implement all of this with a single query plus a local nesting operation, Postgres + DuckDB could do a single nested query. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org