[ https://issues.apache.org/jira/browse/PARQUET-968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16415757#comment-16415757 ]

ASF GitHub Bot commented on PARQUET-968:
----------------------------------------

BenoitHanotte commented on a change in pull request #411: PARQUET-968 Add Hive/Presto support in ProtoParquet
URL: https://github.com/apache/parquet-mr/pull/411#discussion_r177454755
 
 

 ##########
 File path: parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoMessageConverter.java
 ##########
 @@ -345,4 +351,121 @@ public void addBinary(Binary binary) {
     }
 
   }
+
+  /**
+   * This class unwraps the additional LIST wrapper and makes it possible to read the underlying data and then convert
+   * it to protobuf.
+   * <p>
+   * Consider the following protobuf schema:
+   * message SimpleList {
+   *   repeated int64 first_array = 1;
+   * }
+   * <p>
+   * A LIST wrapper is created in parquet for the above mentioned protobuf schema:
+   * message SimpleList {
+   *   required group first_array (LIST) = 1 {
+   *     repeated int32 element;
+   *   }
+   * }
+   * <p>
+   * The LIST wrappers are used by 3rd party tools, such as Hive, to read parquet arrays. The wrapper contains
+   * one and only one field: either a primitive field (like in the example above, where we have an array of ints) or
+   * another group (array of messages).
+   */
+  final class ListConverter extends GroupConverter {
+    private final Converter converter;
+    private final boolean listOfMessage;
+
+    public ListConverter(Message.Builder parentBuilder, Descriptors.FieldDescriptor fieldDescriptor, Type parquetType) {
+      OriginalType originalType = parquetType.getOriginalType();
+      if (originalType != OriginalType.LIST) {
+        throw new ParquetDecodingException("Expected LIST wrapper. Found: " + originalType + " instead.");
+      }
+
+      listOfMessage = fieldDescriptor.getJavaType() == JavaType.MESSAGE;
 
 Review comment:
   done in https://github.com/costimuraru/parquet-mr/pull/2

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Hive/Presto support in ProtoParquet
> ---------------------------------------
>
>                 Key: PARQUET-968
>                 URL: https://issues.apache.org/jira/browse/PARQUET-968
>             Project: Parquet
>          Issue Type: Task
>            Reporter: Constantin Muraru
>            Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
