[ 
https://issues.apache.org/jira/browse/PARQUET-278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546168#comment-14546168
 ] 

Tianshuo Deng commented on PARQUET-278:
---------------------------------------

It's good to double-check for empty fields in the GroupType constructor.
Even if we make the constructor non-public, the fluent builder API still calls 
it.
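
A minimal sketch of that idea, using hypothetical stand-in classes 
(SketchField, SketchGroup, SketchGroupBuilder) rather than Parquet's real 
GroupType and Types classes. The class names, the exception type and the 
message wording are all assumptions for illustration. It only shows that a 
check placed in the constructor also covers the builder path:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a schema field; not Parquet's real Type class.
class SketchField {
    final String name;

    SketchField(String name) {
        this.name = name;
    }
}

// Hypothetical stand-in for GroupType: the constructor rejects empty groups.
class SketchGroup extends SketchField {
    final List<SketchField> fields;

    SketchGroup(String name, List<SketchField> fields) {
        super(name);
        // The check lives in the constructor, so every construction path hits it.
        if (fields == null || fields.isEmpty()) {
            throw new IllegalArgumentException(
                "A group must have at least one field: " + name);
        }
        this.fields = Collections.unmodifiableList(new ArrayList<>(fields));
    }
}

// Hypothetical fluent builder: named() delegates to the constructor,
// so the emptiness check still runs even if the constructor is non-public.
class SketchGroupBuilder {
    private final List<SketchField> fields = new ArrayList<>();

    SketchGroupBuilder field(String name) {
        fields.add(new SketchField(name));
        return this;
    }

    SketchGroup named(String name) {
        return new SketchGroup(name, fields);
    }
}
```

With this sketch, new SketchGroupBuilder().named("Empty") throws 
IllegalArgumentException, while adding at least one field before named() 
succeeds.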

Migrating to the builder API could be done in a separate PR.

Also, I'm not sure that

Types.buildMessage()
      .required(INT64).named("DocId")
      .optionalGroup()
      .repeated(INT64).named("Backward")
      .repeated(INT64).named("Forward")
      .named("Links")
      .repeatedGroup()
      .repeatedGroup()
      .required(BINARY).named("Code")
      .optional(BINARY).named("Country")
      .named("Language")
      .optional(BINARY).named("Url")
      .named("Name")
      .named("Document");

really looks better than the constructor API in terms of readability. A 
Parquet schema is a nested structure, but the fluent API needs manual 
indentation to reflect that nesting. The constructor-based API is simple and 
clear, does its job (constructing objects) nicely, and an IDE can indent it 
pretty well.

new MessageType("Document",
          new PrimitiveType(REQUIRED, INT64, "DocId"),
          new GroupType(OPTIONAL, "Links",
              new PrimitiveType(REPEATED, INT64, "Backward"),
              new PrimitiveType(REPEATED, INT64, "Forward")),
          new GroupType(REPEATED, "Name",
              new GroupType(REPEATED, "Language",
                  new PrimitiveType(REQUIRED, BINARY, "Code"),
                  new PrimitiveType(OPTIONAL, BINARY, "Country")),
              new PrimitiveType(OPTIONAL, BINARY, "Url")));

> enforce non empty group on MessageType level
> --------------------------------------------
>
>                 Key: PARQUET-278
>                 URL: https://issues.apache.org/jira/browse/PARQUET-278
>             Project: Parquet
>          Issue Type: Improvement
>            Reporter: Tianshuo Deng
>
> As a columnar format, Parquet currently does not support an empty 
> struct/group without leaves. We should throw when constructing an empty 
> GroupType to give a clear message.



