rdblue commented on a change in pull request #3912:
URL: https://github.com/apache/iceberg/pull/3912#discussion_r786301646
##########
File path:
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveSchemaConverter.java
##########
@@ -50,21 +52,27 @@ private HiveSchemaConverter(boolean autoConvert) {
this.id = 0;
}
- static Schema convert(List<String> names, List<TypeInfo> typeInfos,
List<String> comments, boolean autoConvert) {
+ static Schema convert(List<String> names, List<TypeInfo> typeInfos,
List<String> comments, boolean autoConvert,
+ Set<Integer> identifierFieldSet) {
HiveSchemaConverter converter = new HiveSchemaConverter(autoConvert);
- return new Schema(converter.convertInternal(names, typeInfos, comments));
+ return new Schema(converter.convertInternal(names, typeInfos, comments,
identifierFieldSet), identifierFieldSet);
}
static Type convert(TypeInfo typeInfo, boolean autoConvert) {
HiveSchemaConverter converter = new HiveSchemaConverter(autoConvert);
return converter.convertType(typeInfo);
}
- List<Types.NestedField> convertInternal(List<String> names, List<TypeInfo>
typeInfos, List<String> comments) {
+ List<Types.NestedField> convertInternal(List<String> names, List<TypeInfo>
typeInfos, List<String> comments,
+ Set<Integer> identifierFieldSet) {
List<Types.NestedField> result =
Lists.newArrayListWithExpectedSize(names.size());
for (int i = 0; i < names.size(); ++i) {
- result.add(Types.NestedField.optional(id++, names.get(i),
convertType(typeInfos.get(i)),
- (comments.isEmpty() || i >= comments.size()) ? null :
comments.get(i)));
+ String doc = (comments.isEmpty() || i >= comments.size()) ? null :
comments.get(i);
+ if (identifierFieldSet.contains(i)) {
+ result.add(Types.NestedField.required(id++, names.get(i),
convertType(typeInfos.get(i)), doc));
Review comment:
Iceberg code does not use the return value of `++` expressions because
it tends to be confusing.
Also, the identifier handling looks incorrect. The IDs used to create fields
here determine the set of field IDs passed as the identifier field set. You
can't have a set of IDs before those IDs are assigned. I suspect that this set
should be a set of column names.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]