github-actions[bot] commented on code in PR #65094:
URL: https://github.com/apache/doris/pull/65094#discussion_r3504304802
##########
fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/source/PaimonScanNode.java:
##########
@@ -228,9 +227,7 @@ private List<String> getOrderedPathPartitionKeys() {
&& !((PaimonSysExternalTable) externalTable).isDataTable()) {
return Collections.emptyList();
}
- return source.getPaimonTable().partitionKeys().stream()
- .map(key -> key.toLowerCase(Locale.ROOT))
- .collect(Collectors.toList());
Review Comment:
Returning the Paimon partition keys with their original case changes the
scan-node contract, but the existing FE unit tests still assert the old
lowercase behavior.
`PaimonScanNodeTest.testGetPathPartitionKeysReturnsTablePartitionKeys` mocks
`["Dt", "Region"]` and expects `["dt", "region"]`, and
`testSetPaimonParamsUsesOrderedPartitionKeys` still uses lowercase `dt`/`pt`
keys even though `setPaimonParams()` now checks
`partitionValues.containsKey(partitionKey)` with the preserved key. Please
update those expectations and seeded partition-value keys, otherwise the FE
unit suite will fail on this PR.
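
A minimal, stdlib-only sketch of why the seeded keys matter. `PartitionKeyCaseDemo` and `countMatched` are hypothetical names, not Doris code; they only mimic the `containsKey(partitionKey)` check described above, assuming the map is an ordinary case-sensitive `Map`.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PartitionKeyCaseDemo {
    // Hypothetical stand-in for the containsKey(partitionKey) check:
    // counts how many ordered keys are found in the partition-value map.
    static long countMatched(List<String> orderedKeys, Map<String, String> partitionValues) {
        return orderedKeys.stream().filter(partitionValues::containsKey).count();
    }

    public static void main(String[] args) {
        // Keys as they would be returned once case is preserved.
        List<String> preservedKeys = List.of("Dt", "Region");

        // Test fixture seeded the old way, with lowercase keys.
        Map<String, String> lowerSeeded = new HashMap<>();
        lowerSeeded.put("dt", "2024-01-01");
        lowerSeeded.put("region", "us");

        // Test fixture seeded with the exact preserved keys.
        Map<String, String> exactSeeded = new HashMap<>();
        exactSeeded.put("Dt", "2024-01-01");
        exactSeeded.put("Region", "us");

        System.out.println(countMatched(preservedKeys, lowerSeeded)); // 0
        System.out.println(countMatched(preservedKeys, exactSeeded)); // 2
    }
}
```

With a case-sensitive map, none of the preserved keys match a lowercase-seeded fixture, which is why both the expectations and the seeded keys need updating together.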
##########
fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonExternalTable.java:
##########
@@ -346,7 +346,7 @@ public Optional<SchemaCacheValue> initSchema(SchemaCacheKey key) {
Set<String> partitionColumnNames = Sets.newHashSet(tableSchema.partitionKeys());
List<Column> partitionColumns = Lists.newArrayList();
for (DataField field : columns) {
- Column column = new Column(field.name().toLowerCase(),
+ Column column = new Column(field.name(),
Review Comment:
This preserves mixed-case Paimon field names in Doris `Column` objects, but
the scan projection paths still assume required column names are lower-case. In
`PaimonScanNode.getPaimonSplitFromAPI()`, the projection list is built from
`paimonTable.rowType().getFieldNames().stream().map(String::toLowerCase)...indexOf(slot.getColumn().getName())`;
after this change a slot for `mIxEd_COL` keeps the exact `Column` name, so it
is looked up in `["mixed_col", ...]`, returns `-1`, and gets filtered out
before `withProjection(projected)`. The JNI reader has the same boundary: BE
sends exact `SlotDescriptor::col_name()` values as `required_fields`, while the
Java scanner lowercases Paimon field names and then does exact lookup, which
will throw `RequiredField mIxEd_COL not found in schema`. Please make these
projection lookups case-insensitive or normalize both sides consistently, and
add coverage for selecting a mixed-case Paimon column.
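
One possible shape for the case-insensitive option, sketched without any Paimon or Doris dependencies. `ProjectionLookup`, `indexOfIgnoreCase`, and `project` are hypothetical names used for illustration; the sketch assumes no two fields differ only by case, in which case the first match would win.

```java
import java.util.List;

final class ProjectionLookup {
    // Returns the index of the first field whose name matches ignoring case, or -1.
    static int indexOfIgnoreCase(List<String> fieldNames, String required) {
        for (int i = 0; i < fieldNames.size(); i++) {
            if (fieldNames.get(i).equalsIgnoreCase(required)) {
                return i;
            }
        }
        return -1;
    }

    // Builds a projection index array, dropping names that cannot be resolved
    // (mirroring the filter step described in the comment above).
    static int[] project(List<String> fieldNames, List<String> requiredColumns) {
        return requiredColumns.stream()
                .mapToInt(name -> indexOfIgnoreCase(fieldNames, name))
                .filter(i -> i >= 0)
                .toArray();
    }

    public static void main(String[] args) {
        List<String> fields = List.of("id", "mIxEd_COL", "dt");
        int[] projected = project(fields, List.of("mIxEd_COL", "ID"));
        System.out.println(projected.length); // 2
    }
}
```

Normalizing both sides to lowercase before an exact lookup would work equally well; the point is only that the two sides of the comparison use the same rule.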
##########
fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/IcebergMetadataOps.java:
##########
@@ -368,8 +368,8 @@ public boolean performCreateTable(CreateTableInfo createTableInfo) throws UserEx
.map(col -> new StructField(col.getName(), col.getType(),
col.getComment(), col.isAllowNull()))
.collect(Collectors.toList());
StructType structType = new StructType(new ArrayList<>(collect));
- Type visit = DorisTypeVisitor.visit(structType, new DorisTypeToIcebergType(structType));
+ List<String> rootFieldNames = columns.stream().map(Column::getName).collect(Collectors.toList());
+ Type visit = DorisTypeVisitor.visit(structType, new DorisTypeToIcebergType(structType, rootFieldNames));
Review Comment:
Preserving the Iceberg schema's original root column names means the
sort-order builder also needs the same case-insensitive name resolution that
was added for partition specs. Doris validates `sortOrderFields` against a
case-insensitive column map, but `buildSortOrder()` still passes
`sortField.getColumnName()` directly into `SortOrder.builderFor(schema)`, whose
builder binds names case-sensitively by default. For example, a CREATE TABLE
with column `mIxEd_COL` and `ORDER BY (mixed_col)` can pass Doris validation
and then fail to bind against the preserved Iceberg schema. Please resolve sort fields
through `schema.caseInsensitiveFindField()` before calling `asc`/`desc`, and
add a mixed-case sort-order test.
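
The resolve-then-bind idea can be illustrated without the Iceberg API. In this sketch, `SortFieldResolver` and `resolve` are hypothetical names; the class maps a user-written name to the schema's canonical spelling, and the caller would then pass that canonical name to `asc`/`desc`. It assumes a schema whose field names are unique ignoring case, and rejects one that is not.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

final class SortFieldResolver {
    // Case-insensitive map from any spelling of a name to its canonical schema spelling.
    private final Map<String, String> canonicalByName =
            new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    SortFieldResolver(List<String> schemaFieldNames) {
        for (String name : schemaFieldNames) {
            String previous = canonicalByName.putIfAbsent(name, name);
            if (previous != null) {
                // Two fields differ only by case, so resolution would be ambiguous.
                throw new IllegalArgumentException(
                        "Ambiguous field names: " + previous + " and " + name);
            }
        }
    }

    // Returns the canonical schema name for the requested sort column, if any.
    Optional<String> resolve(String requested) {
        return Optional.ofNullable(canonicalByName.get(requested));
    }

    public static void main(String[] args) {
        SortFieldResolver resolver = new SortFieldResolver(List.of("id", "mIxEd_COL"));
        System.out.println(resolver.resolve("mixed_col").orElse("<unresolved>")); // mIxEd_COL
    }
}
```

A mixed-case sort-order test would then assert that the bound sort field refers to the preserved column rather than failing at bind time.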
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]