szlta commented on a change in pull request #3411:
URL: https://github.com/apache/iceberg/pull/3411#discussion_r821845093
##########
File path: core/src/main/java/org/apache/iceberg/Partitioning.java
##########
@@ -210,7 +211,8 @@ public static StructType partitionType(Table table) {
}
Map<Integer, PartitionField> fieldMap = Maps.newHashMap();
- List<NestedField> structFields = Lists.newArrayList();
+ Map<Integer, Type> typeMap = Maps.newHashMap();
+ Map<Integer, String> nameMap = Maps.newHashMap();
Review comment:
I have run into similar issues, and I think this change will help resolve
type changes of columns in older specs.
Another problem I have seen, and which is probably still present, is that this
method may return the same column name more than once. Consider the following:
Table schema: a int, b date, c date
spec0: year(b), a
spec1: a
spec2: year(b), year(c)
Then the result is something like this:
1000: b_year int
1001: a int
1002: b_year int
1003: c_year int
Further down the line, when we construct a Schema object, we get a
failure because the name b_year is present in two fields (1000 and 1002).
How should this case be handled? Maybe by appending _r<fieldId> to each column
name?
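To make the suggestion concrete, here is a minimal sketch in plain Java. It does not use any Iceberg classes; PartitionNameDedup and uniqueNames are hypothetical names chosen for illustration, and the input map stands in for the field id to name mapping from the example above. As an assumption on my part, it appends the _r<fieldId> suffix only to names that occur more than once, rather than to every column name, so that non-conflicting names stay unchanged.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class PartitionNameDedup {

  // Hypothetical helper, not part of Iceberg: given partition field id -> name,
  // returns a map in the same iteration order where every name that occurs
  // more than once gets "_r<fieldId>" appended. Unique names are left as-is.
  static Map<Integer, String> uniqueNames(Map<Integer, String> namesById) {
    // First pass: count how often each name occurs.
    Map<String, Integer> counts = new HashMap<>();
    for (String name : namesById.values()) {
      counts.merge(name, 1, Integer::sum);
    }

    // Second pass: suffix only the names that collide.
    Map<Integer, String> result = new LinkedHashMap<>();
    for (Map.Entry<Integer, String> entry : namesById.entrySet()) {
      String name = entry.getValue();
      boolean collides = counts.get(name) > 1;
      result.put(entry.getKey(), collides ? name + "_r" + entry.getKey() : name);
    }
    return result;
  }

  public static void main(String[] args) {
    // Field ids and names from the spec0/spec1/spec2 example.
    Map<Integer, String> names = new LinkedHashMap<>();
    names.put(1000, "b_year");
    names.put(1001, "a");
    names.put(1002, "b_year");
    names.put(1003, "c_year");

    // prints {1000=b_year_r1000, 1001=a, 1002=b_year_r1002, 1003=c_year}
    System.out.println(uniqueNames(names));
  }
}
```

This sketch does not handle the case where a suffixed name happens to match an existing column name (for example a partition field that is already called b_year_r1000), so a real fix would probably need an extra check for that.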
cc: @RussellSpitzer , @aokolnychyi
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.