SinghAsDev commented on a change in pull request #3723:
URL: https://github.com/apache/iceberg/pull/3723#discussion_r768113767



##########
File path: parquet/src/main/java/org/apache/iceberg/parquet/ApplyNameMapping.java
##########
@@ -88,6 +96,31 @@ public Type primitive(PrimitiveType primitive) {
     return field == null ? primitive : primitive.withId(field.id());
   }
 
+  @Override
+  public void beforeElementField(Type element) {
+    super.beforeElementField(makeElement(element));
+  }
+
+  @Override
+  public void afterElementField(Type element) {
+    super.afterElementField(makeElement(element));
+  }
+
+  private Type makeElement(Type element) {
+    // List's element in 3-level lists can be named differently across different parquet writers.
+    // For example, hive names it "array_element", whereas new parquet writers names it as "element".
+    if (element.getName().equals("element") || element.isPrimitive()) {
+      return element;
+    }

Review comment:
       I tried to remove the hard-coded "element" name, but there isn't a clean way to do so. The only alternative I could come up with is to build a dummy list and then pull the element back out of it. I am not sure that is any cleaner than the current approach. Below is the change I am referring to.
   ```
     private Type makeElement(Type element) {
       // List's element in 3-level lists can be named differently across different parquet writers.
       // For example, hive names it "array_element", whereas new parquet writers names it as "element".
       if (element.isPrimitive()) {
         return element;
       }
   
       Types.BaseListBuilder.GroupElementBuilder<GroupType, Types.ListBuilder<GroupType>> dummyBuilder =
           Types.list(Type.Repetition.OPTIONAL)
               .groupElement(element.getRepetition())
               .addFields(element.asGroupType().getFields().toArray(new Type[0]));
       if (element.getId() != null) {
         dummyBuilder.id(element.getId().intValue());
       }
       return dummyBuilder.named("dummy").getType(0).asGroupType().getType(0);
     }
   ```
   
   I don't have a strong preference either way. @kbendick, let me know if you like this one better and I will make the change.
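
   For anyone following along without the Parquet classes at hand, the effect both variants aim for can be sketched without any library. This is only an illustration: `ElementNameSketch` and `Node` are hypothetical stand-ins, not `org.apache.parquet.schema` types, and "primitive" is modelled simply as "has no child fields". A group element written under a different name (for example "array_element") is rebuilt under the canonical name, keeping its id and fields, while primitives pass through unchanged.

```java
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for the schema types; NOT the Parquet API.
final class ElementNameSketch {
  // Canonical element name used by newer writers (per the comment in the diff).
  static final String ELEMENT_NAME = "element";

  static final class Node {
    final String name;
    final Integer id;         // null when no field id is assigned
    final List<Node> fields;  // empty list models a primitive (assumption)

    Node(String name, Integer id, List<Node> fields) {
      this.name = Objects.requireNonNull(name);
      this.id = id;
      this.fields = List.copyOf(fields);
    }

    boolean isPrimitive() {
      return fields.isEmpty();
    }
  }

  // Same shape as makeElement: primitives are returned as-is; group elements
  // are rebuilt under the canonical name with their id and fields preserved.
  static Node makeElement(Node element) {
    if (element.isPrimitive()) {
      return element;
    }
    return new Node(ELEMENT_NAME, element.id, element.fields);
  }

  public static void main(String[] args) {
    Node hiveElement = new Node("array_element", 7,
        List.of(new Node("x", 8, List.of()), new Node("y", 9, List.of())));
    Node normalized = makeElement(hiveElement);
    System.out.println(normalized.name + " id=" + normalized.id
        + " fields=" + normalized.fields.size());
  }
}
```

   In the sketch the canonical name is a constant; the dummy-list variant above instead lets the list builder supply that name, which is the only real difference between the two approaches.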




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


