fwojciec commented on issue #1755:
URL: https://github.com/apache/arrow-adbc/issues/1755#issuecomment-3193941046
Not sure this is an option for ADBC (likely not, because it comes with at least one somewhat significant tradeoff) but maybe something that could be opted into via config?

It's possible to do schema introspection at query time with minimal overhead (benchmarked at ~178 microseconds per query on local connections) using PREPARE statements - example Go implementation from one of my projects using Go's pgx as the postgres driver:

```go
// GetQueryMetadata uses PREPARE to extract column metadata without executing the query.
func (p *Pool) GetQueryMetadata(ctx context.Context, conn *pgxpool.Conn, sql string) (*arrow.Schema, []uint32, error) {
	// Generate a unique statement name to avoid collisions in concurrent usage
	stmtName := fmt.Sprintf("pgarrow_meta_%p", conn)

	sd, err := conn.Conn().Prepare(ctx, stmtName, sql)
	if err != nil {
		return nil, nil, fmt.Errorf("failed to prepare statement for metadata discovery: %w", err)
	}
	defer func() {
		_, _ = conn.Conn().Exec(ctx, "DEALLOCATE "+stmtName)
	}()

	if len(sd.Fields) == 0 {
		return nil, nil, fmt.Errorf("query returned no columns - Arrow conversion requires queries that return at least one column")
	}

	columns := make([]ColumnInfo, len(sd.Fields))
	fieldOIDs := make([]uint32, len(sd.Fields))
	for i, field := range sd.Fields {
		columns[i] = ColumnInfo{
			Name: field.Name,
			OID:  field.DataTypeOID,
		}
		fieldOIDs[i] = field.DataTypeOID
	}

	schema, err := CreateSchema(columns)
	if err != nil {
		return nil, nil, &SchemaError{
			Columns: columns,
			Err:     err,
		}
	}

	return schema, fieldOIDs, nil
}
```

The main tradeoff is that this method doesn't detect column nullability - which I've personally found to be a good tradeoff since there's no performance benefit to knowing nullability when converting data (as far as I know) and the query engine I was using at the time (DuckDB) treats all Arrow data as nullable anyway (which arguably also follows Arrow's design philosophy).
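The snippet above calls `CreateSchema` without showing it, so here is a rough, standard-library-only sketch of what the OID-to-type step might look like, with every field marked nullable as described. `Field`, `BuildFields`, and `pgOIDToArrow` are hypothetical names made up for illustration, and Arrow types are represented as plain strings rather than real Arrow library types. The OIDs are the built-in PostgreSQL `pg_type` OIDs for a handful of common types; a real mapping would cover many more.

```go
package main

import "fmt"

// ColumnInfo mirrors the struct used in the snippet above.
type ColumnInfo struct {
	Name string
	OID  uint32
}

// Field is a hypothetical, simplified stand-in for an Arrow field.
type Field struct {
	Name     string
	Type     string // Arrow type name, e.g. "int32"
	Nullable bool
}

// pgOIDToArrow maps a few built-in PostgreSQL type OIDs to Arrow type names.
// This table is illustrative and intentionally incomplete.
var pgOIDToArrow = map[uint32]string{
	16:   "bool",    // bool
	17:   "binary",  // bytea
	20:   "int64",   // int8
	21:   "int16",   // int2
	23:   "int32",   // int4
	25:   "utf8",    // text
	700:  "float32", // float4
	701:  "float64", // float8
	1043: "utf8",    // varchar
	1082: "date32",  // date
}

// BuildFields converts column metadata into fields. PREPARE does not report
// nullability, so every field is marked nullable.
func BuildFields(columns []ColumnInfo) ([]Field, error) {
	fields := make([]Field, 0, len(columns))
	for _, col := range columns {
		typeName, ok := pgOIDToArrow[col.OID]
		if !ok {
			return nil, fmt.Errorf("unsupported type OID %d for column %q", col.OID, col.Name)
		}
		fields = append(fields, Field{Name: col.Name, Type: typeName, Nullable: true})
	}
	return fields, nil
}

func main() {
	fields, err := BuildFields([]ColumnInfo{
		{Name: "id", OID: 23},
		{Name: "name", OID: 25},
	})
	if err != nil {
		panic(err)
	}
	for _, f := range fields {
		fmt.Printf("%s: %s (nullable=%t)\n", f.Name, f.Type, f.Nullable)
	}
}
```

Returning an error for unknown OIDs (rather than falling back to a string type) is just one possible choice here; it keeps unsupported types visible to the caller.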
Just sharing because it might be something to consider in this context - no idea how to write it in C++ though... :/

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org