vandop opened a new issue, #3134:
URL: https://github.com/apache/arrow-adbc/issues/3134

   ### What happened?
   
   # Go ADBC Schema validation strictness causes failures with Dremio
   
   ## Summary
   
   The Go ADBC FlightSQL driver performs strict schema validation that fails 
when distributed query engines like Dremio return data from multiple endpoints 
with minor schema inconsistencies (e.g., nullable differences). This appears to 
be overly strict for real-world distributed systems where schema inference at 
planning time may not match the actual runtime schema.
   
   ## Problem Description
   
   ### Current Behavior
   
   1. The `readInfo` function validates the schema each endpoint returns against 
the schema returned by `GetFlightInfo`.
   
   ### Failure Scenario
   When querying Dremio (and likely other distributed engines), queries fail 
with errors like:
   ```
   endpoint 0 returned inconsistent schema: expected schema:
     fields: 1
       - test_value: type=int32 
   but got schema:
     fields: 1
       - test_value: type=int32, nullable
   ```
   
   ### Root Cause
   Dremio cannot always guarantee that the schema inferred at planning time 
(the one returned by GetFlightInfo) matches the actual schema returned by each 
execution endpoint.
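
To make the mismatch concrete, here is a minimal sketch in plain Python. It is not the driver's Go code: the `Field` dataclass and the `strict_equal` helper are hypothetical stand-ins for an Arrow field and for an exact schema comparison, and the two schemas are copied from the error message above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for an Arrow field: name, type, nullability."""
    name: str
    type: str
    nullable: bool = False


def strict_equal(expected, actual):
    """Exact comparison: every attribute, including nullability, must match."""
    return list(expected) == list(actual)


# Schema reported at planning time, as shown in the error message.
planned = [Field("test_value", "int32", nullable=False)]
# Schema actually sent by endpoint 0, as shown in the error message.
streamed = [Field("test_value", "int32", nullable=True)]

print(strict_equal(planned, streamed))  # False
```

The name and type agree, so the only thing an exact comparison rejects here is the nullable flag.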
   
   ## Impact
   
   This affects real-world usage where:
   - Simple queries that work in native SQL clients fail through ADBC, or at 
least through the paths that enforce this strictness.
   
   ## Possible Solution
   
   ### Configuration-based relaxed validation
   Add driver options to control schema validation strictness:
   
   ```python
   # Skip schema validation entirely
   conn = dbapi.connect(uri, db_kwargs={
       "adbc.flight.sql.skip_schema_validation": "true"
   })
   
   # Relaxed validation (ignore nullable differences)
   conn = dbapi.connect(uri, db_kwargs={
       "adbc.flight.sql.relaxed_schema_validation": "true"
   })
   ```
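
As a rough illustration of how the two proposed options could differ, the sketch below models a schema as a list of `(name, type, nullable)` tuples. The `schemas_compatible` function and its `mode` values are hypothetical and exist only for this example; they are not part of the driver.

```python
def schemas_compatible(expected, actual, mode="strict"):
    """Compare two schemas given as lists of (name, type, nullable) tuples.

    The mode values are hypothetical: "strict" compares everything,
    "relaxed" ignores the nullable flag, and "skip" accepts any schema.
    """
    if mode == "skip":
        return True
    if mode == "relaxed":
        # Drop the nullable flag before comparing names and types.
        expected = [(name, typ) for name, typ, _ in expected]
        actual = [(name, typ) for name, typ, _ in actual]
    elif mode != "strict":
        raise ValueError(f"unknown mode: {mode!r}")
    return list(expected) == list(actual)


planned = [("test_value", "int32", False)]
streamed = [("test_value", "int32", True)]     # differs only in nullability
wrong_type = [("test_value", "int64", True)]   # differs in type as well

print(schemas_compatible(planned, streamed))               # False
print(schemas_compatible(planned, streamed, "relaxed"))    # True
print(schemas_compatible(planned, wrong_type, "relaxed"))  # False
print(schemas_compatible(planned, wrong_type, "skip"))     # True
```

In this model the relaxed mode still rejects a genuine type change, while skipping validation accepts it, which is the trade-off between the two options.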
   
   ## Questions for Maintainers
   
   1. **Is the current strict validation intentional** for data integrity 
reasons, or is it an implementation artifact?
   
   2. **Would configurable validation be acceptable** to balance data integrity 
with real-world compatibility?
   
   3. **Are there existing patterns** in other ADBC drivers for handling schema 
inconsistencies?
   
   ## Environment
   
   - **ADBC Go/Python Version**: 19
   - **Server**: Dremio
   - **Language**: Go and Python
   
   ## Code References
   
   - Schema validation in metadata operations: 
`go/adbc/driver/flightsql/flightsql_connection.go:709`
   
   ## Workarounds
   
   None known. Executing the query succeeds, but reading the results triggers 
the strict validation error in both Go and Python.
   
   
   
   
   ### Stack Trace
   
   _No response_
   
   ### How can we reproduce the bug?
   
   1. Start a Dremio instance (for example, in Docker).
   2. Execute `SHOW SCHEMAS` or `SELECT 1 "Test"` and try to print the results.
   
   ### Environment/Setup
   
   _No response_

