DanielCarter-stack commented on PR #10436:
URL: https://github.com/apache/seatunnel/pull/10436#issuecomment-3834454057
<!-- code-pr-reviewer -->
<!-- cpr:pr_reply_v2_parts {"group": "apache/seatunnel#10436", "part": 1,
"total": 1} -->
### Issue 1: Insufficient Test Coverage (MINOR)
**Location**:
`seatunnel-api/src/test/java/org/apache/seatunnel/api/table/catalog/SeaTunnelDataTypeConvertorUtilTest.java:313-324`
**Related Context**:
- Test class: `SeaTunnelDataTypeConvertorUtilTest`
- Method under test: `SeaTunnelDataTypeConvertorUtil.parseArrayType()`
(private method)
- Caller: `SeaTunnelDataTypeConvertorUtil.deserializeSeaTunnelDataType()`
**Problem Description**:
The newly added test case `testAllSupportedTypes` only tests
`array<array<int>>` (two-dimensional array), but does not verify:
1. Three-layer or deeper nesting: `array<array<array<int>>>`
2. Combinations of Array with other complex types: `array<map<string,
int>>`, `array<row<...>>`
Although the recursive logic theoretically supports arbitrary depth,
boundary testing is missing.
**Potential Risks**:
- Risk 1: If someone modifies `ArrayType.of()` or `getGenericType()` in the
future, deep nesting may be broken
- Risk 2: The type system's consistency assumptions are not explicitly
verified in tests
**Impact Scope**:
- Direct impact: `testAllSupportedTypes` test method
- Indirect impact: All Connectors that depend on nested type resolution (if
users use three or more layers of nesting)
- Affected area: Core API layer (low-frequency scenario)
**Severity**: MINOR
**Improvement Suggestion**:
```java
// Add at the end of the testAllSupportedTypes() method
// Test deep nested arrays (3+ levels)
ArrayType<?, ?> arrayOfArrayOfArrayType =
(ArrayType<?, ?>)
SeaTunnelDataTypeConvertorUtil.deserializeSeaTunnelDataType(
"c_array_array_array", "array<array<array<int>>>");
Assertions.assertEquals(ArrayType.INT_ARRAY_TYPE,
((ArrayType<?, ?>)
arrayOfArrayOfArrayType.getElementType()).getElementType());
// Test array with other complex types
SeaTunnelRowType rowInArray =
(SeaTunnelRowType)
SeaTunnelDataTypeConvertorUtil.deserializeSeaTunnelDataType(
"c_array_row", "array<{c1 = string, c2 = int}>");
Assertions.assertEquals(2, rowInArray.getFieldTypes().length);
```
**Rationale**:
1. Completeness: Verify recursive depth boundaries
2. Consistency: Ensure Array's nesting behavior is consistent with Map and
Row
3. Regression protection: Prevent future changes from breaking deep nesting
### Issue 2: Documentation Does Not Explicitly State Support for Nested
Arrays (MINOR)
**Location**: `docs/en/introduction/concepts/schema-feature.md:98`
**Related Context**:
- Documentation section: What type supported at now
- PR goal: Fix nested array support
- User group: All users using Schema configuration
**Problem Description**:
schema-feature.md line 98 states that array element types include "int
string boolean ... double", but does not explicitly state that arrays
themselves can be contained. Although line 109 states that map's value can be
"any type", which technically includes arrays, users may be misled by the
explicit list of array element types.
**Potential Risks**:
- Risk 1: Users may think arrays can only contain basic types and dare not
use nested arrays
- Risk 2: Inconsistency between documentation and code implementation
reduces credibility
**Impact Scope**:
- Direct impact: schema-feature.md documentation
- Indirect impact: All users who need to use nested arrays
- Affected area: User documentation (medium impact)
**Severity**: MINOR
**Improvement Suggestion**:
```markdown
| array | `ValueType[]` | A array is a data type that represents a
collection of elements.
The element type can be any basic type (int, string, boolean, tinyint,
smallint, bigint, float, double) or another array for nested structures. |
```
And add in the example section:
```hocon
# Example of nested array
c_matrix = "array<array<int>>" # 2D array of integers
```
**Rationale**:
1. Accuracy: Documentation should reflect the actual capabilities of the code
2. Usability: Explicitly inform users that nested arrays can be used to
improve feature visibility
3. Consistency: Maintain consistency with the map type description (map
states value can be "any type")
### Issue 3: Missing Tests for `array<row<...>>` (MINOR)
**Location**:
`seatunnel-api/src/test/java/org/apache/seatunnel/api/table/catalog/SeaTunnelDataTypeConvertorUtilTest.java`
**Related Context**:
- Row type itself supports nesting (testConvertRowTypeNest has verified this)
- Array type now supports nesting (newly added in this PR)
- But the combination of Array nesting Row has not been tested
**Problem Description**:
Test cases cover nested Array and nested Map, but do not verify the scenario
of Array containing Row. Given that Row is a commonly used complex type in
SeaTunnel (corresponding to structs), this is a reasonable use case.
**Potential Risks**:
- Risk 1: The HOCON parser may have boundary issues when processing
`array<{a = int}>`
- Risk 2: `ArrayType.of()` may have type inference issues when the element
type is `SeaTunnelRowType`
**Impact Scope**:
- Direct impact: Test coverage
- Indirect impact: If users use `array<row<...>>`, they may encounter
untested bugs
- Affected area: Core API layer (medium-frequency scenario)
**Severity**: MINOR
**Improvement Suggestion**:
```java
// Add in the testAllSupportedTypes() method
// Test array with row type
SeaTunnelRowType rowTypeInArray =
(SeaTunnelRowType)
SeaTunnelDataTypeConvertorUtil.deserializeSeaTunnelDataType(
"c_array_row", "array<{name = string, age = int}>");
Assertions.assertEquals(2, rowTypeInArray.getFieldTypes().length);
Assertions.assertEquals("name", rowTypeInArray.getFieldName(0));
Assertions.assertEquals(BasicType.STRING_TYPE,
rowTypeInArray.getFieldType(0));
```
**Rationale**:
1. Practicality: Array<Row> is a common complex data structure (such as
array objects)
2. Completeness: Verify combinations of Array with all complex types
3. Risk control: HOCON parsing may have boundary issues in this scenario
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]