andygrove opened a new issue, #2766:
URL: https://github.com/apache/datafusion-comet/issues/2766
### What is the problem the feature request solves?
There are currently gaps in testing for casts involving complex types. This
issue is for adding tests to cover all casts thet Comet claims to support.
### Describe the potential solution
[This section is AI generated and not verified to be accurate]
Summary of Current Cast Test Coverage
ARRAY Type Casts
Currently Tested (in CometCastSuite.scala:1050-1073):
- ✅ Array → String (tested with various element types: Boolean, String,
Byte, Integer, Long, Short, Decimal(10,2), Decimal(38,18))
Implemented but NOT tested:
- ❌ Array → Array (with element type casting) - Implemented in
cast.rs:1034-1036
- ❌ Null → Array - Marked as compatible in cast.rs:1141
STRUCT Type Casts
Currently Tested (in CometCastSuite.scala:959-1017):
- ✅ Struct → String (with primitive types, decimals, dates, timestamps,
nested structs)
- ✅ Struct → Struct (with field type changes)
- ✅ Struct → Struct (with different field names)
Well covered - The struct casting tests are comprehensive.
MAP Type Casts
Currently Tested:
- ❌ NO TESTS AT ALL - MapType is completely missing from cast tests
Implementation Status:
- Maps are NOT implemented in cast.rs (no Map-related cast logic found)
- CometCast.scala does NOT support map casting (not in the isSupported
method)
---
Identified Testing Gaps
Critical Gaps (High Priority)
1. Map Type Casting - Completely Missing
- No tests for any map casting operations
- Map → String (if Spark supports it)
- Map → Map (with key/value type changes)
- This appears to be completely unsupported, which should be documented
2. Array → Array Casting
- Gap: Element type transformations are implemented but not tested
- Examples needed:
- CAST(array(1, 2, 3) AS array<string>) - Int array to String array
- CAST(array(1.5, 2.5) AS array<int>) - Float array to Int array (with
rounding)
- CAST(array('1', '2', '3') AS array<int>) - String array to Int array
(parsing)
- Nested arrays: array<array<int>> to array<array<string>>
3. Array of Structs → String
- Not currently tested
- Example: array(struct(1, 'a'), struct(2, 'b')) cast to string
4. Struct with Array Fields → Struct
- Not explicitly tested with array field type changes
- Example: struct<a: array<int>> → struct<a: array<string>>
5. Null Handling in Complex Types
- Arrays with null elements being cast
- Structs with null fields being cast
- Null arrays/structs themselves
Medium Priority Gaps
6. Array with Binary Elements → String
- Explicitly marked as Incompatible in CometCast.scala:119-120
- Should have a test verifying this incompatibility
7. Arrays of Timestamps/Dates → Arrays of Other Types
- Test temporal type array transformations
- Example: array<timestamp> → array<string>
8. Edge Cases for Array Casts
- Empty arrays: CAST(array() AS array<string>)
- Single element arrays
- Large arrays (performance/correctness)
- Arrays with all nulls
9. Struct with Map Fields
- If/when map casting is supported
- struct<a: int, b: map<string, int>> casting
10. ANSI Mode Testing for Complex Types
- Array/Struct casts with invalid conversions in ANSI mode
- Error propagation from element casts
Low Priority Gaps
11. Dictionary-encoded Complex Types
- Dictionary-encoded arrays/structs
- The test suite has some dictionary support for primitives
12. Deep Nesting
- array<array<array<int>>> → array<array<array<string>>>
- struct<a: struct<b: struct<c: int>>> with type changes
---
Recommendations
Immediate Actions
1. Add Array → Array tests covering:
test("cast ArrayType to ArrayType with different element types") {
// Int → String
// Float → Int
// String → Int
// Decimal → Int
// Boolean → String
}
2. Document Map casting limitations:
- Add ignored/commented tests showing map casting is unsupported
- Or implement map casting if Spark supports it
3. Add null handling tests for complex types:
test("cast ArrayType with nulls") {
// [null, 1, 2] as array<string>
// null array value
}
Implementation Gaps
From the code analysis, these casts are implemented in Rust but lack Scala
tests:
- Array[T1] → Array[T2] (recursive element casting)
- Null → Array[T]
These casts are completely missing implementation:
- Anything involving MapType
Test Coverage Metrics
Current complex type cast coverage:
- Struct casts: ~80% (good coverage)
- Array casts: ~20% (only Array→String tested)
- Map casts: 0% (no implementation or tests)
Suggested Test File Structure
Consider organizing tests as:
// In CometCastSuite.scala
// ARRAY CASTS
test("cast ArrayType to StringType") { ... } // EXISTS
test("cast ArrayType to ArrayType - element type changes") { ... } //
MISSING
test("cast ArrayType to ArrayType - nested arrays") { ... } // MISSING
test("cast ArrayType with null elements") { ... } // MISSING
// STRUCT CASTS
test("cast StructType to StringType") { ... } // EXISTS
test("cast StructType to StructType") { ... } // EXISTS
test("cast StructType with ArrayType fields") { ... } // MISSING
test("cast StructType with null fields") { ... } // MISSING
// MAP CASTS
ignore("cast MapType to StringType") { ... } // MISSING - not implemented
ignore("cast MapType to MapType") { ... } // MISSING - not implemented
This analysis should help you prioritize which tests to add to improve
coverage of complex type casting operations.
### Additional context
_No response_
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]