addu390 opened a new pull request, #362: URL: https://github.com/apache/doris-spark-connector/pull/362
# Proposed changes

Issue Number: close #341

## Problem Summary:

Doris `ARRAY` columns are currently exposed to Spark as `StringType`, so SQL functions like `size(col)` fail with `DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE`. This PR adds an opt-in config `doris.read.array.native-type` (default `false`) that surfaces them as `ArrayType(StringType)` instead. It covers both the `thrift` and `arrow` read modes. Keeping the option off by default leaves existing users on the legacy JSON-string behavior. Element-type inference (e.g. `array<int>` → `IntegerType`) is intentionally out of scope and left as a follow-up.

## Checklist(Required)

1. Does it affect the original behavior: No (gated by a default-off config)
2. Has unit tests been added: Yes (`SchemaConvertors`, `RowConvertors`, plus parameterized IT for both read modes)
3. Has document been added or modified: Yes (option description); upstream docs at `apache/doris` will be a follow-up
4. Does it need to update dependencies: No
5. Are there any changes that cannot be rolled back: No

## Further comments

Built on top of the discussion in #341 and the design feedback on the stale #345. Happy to iterate on the option name or scope.
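
To make the gating described above concrete, here is a minimal, self-contained Python sketch of how a single `ARRAY` cell might surface under the two settings. It is a model of the described behavior, not the connector's code: only the option name `doris.read.array.native-type` comes from this PR, while `read_array_cell` and the assumption that the legacy string is valid JSON are hypothetical.

```python
import json
from typing import List, Optional, Union

# Option name taken from the PR description; everything else in this block
# is a hypothetical model, not the connector's implementation.
NATIVE_ARRAY_OPTION = "doris.read.array.native-type"


def read_array_cell(
    raw: Optional[str], options: dict
) -> Union[None, str, List[Optional[str]]]:
    """Model how one ARRAY cell would be exposed, depending on the option."""
    if raw is None:
        return None

    # The option defaults to "false", so existing readers are unaffected.
    native = str(options.get(NATIVE_ARRAY_OPTION, "false")).lower() == "true"
    if not native:
        # Legacy behavior: the value stays a string (StringType in Spark).
        return raw

    # Opt-in behavior: a list of strings, mirroring ArrayType(StringType).
    # Elements are stringified because element-type inference is out of scope.
    # Assumption: the legacy string parses as JSON.
    return [
        None if item is None else item if isinstance(item, str) else json.dumps(item)
        for item in json.loads(raw)
    ]


legacy = read_array_cell("[1, 2, 3]", {})
native = read_array_cell("[1, 2, 3]", {NATIVE_ARRAY_OPTION: "true"})
```

With the option off, `legacy` remains the string `"[1, 2, 3]"`, which is why a function such as `size(col)` rejects it. With the option on, `native` is a three-element list of strings, so length-style operations apply while the elements stay untyped.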
