[ 
https://issues.apache.org/jira/browse/PARQUET-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425789#comment-16425789
 ] 

Nandor Kollar commented on PARQUET-1253:
----------------------------------------

While working on the new logical type representation, three questions came to 
mind:
* Although there is a Thrift struct for the UUID logical type in parquet-format, 
it is not included in the LogicalType union. Is this on purpose, or was it 
omitted accidentally? How should parquet-mr handle schemas where the UUID 
annotation is used but there is no corresponding LogicalType mapping?
* A similar question applies to MAP_KEY_VALUE, except that it is not 
implemented at all in the new representation. What should parquet-mr do with 
schemas that use it in the old representation?
* In parquet-format, the comment for {{optional LogicalType logicalType}} says 
{{"The logical type of this SchemaElement; only valid for primitives."}} This 
confuses me, because there are Map and List logical types, which, as far as 
I know, make sense only on groups. What was the intention of this comment? Am 
I missing something?

[~rdblue] I can see that you worked on the new logical type representation. 
Could you please help me clarify these questions?

> Support for new logical type representation
> -------------------------------------------
>
>                 Key: PARQUET-1253
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1253
>             Project: Parquet
>          Issue Type: Improvement
>          Components: parquet-mr
>            Reporter: Nandor Kollar
>            Assignee: Nandor Kollar
>            Priority: Major
>
> Latest parquet-format 
> [introduced|https://github.com/apache/parquet-format/commit/863875e0be3237c6aa4ed71733d54c91a51deabe#diff-0f9d1b5347959e15259da7ba8f4b6252]
>  a new representation for logical types. As of now this is not yet supported 
> in parquet-mr, so there is no way to use parameterized, UTC-normalized 
> timestamp data types. When reading and writing Parquet files, parquet-mr 
> should use the new 'logicalType' field in SchemaElement, in addition to 
> 'converted_type', to determine the current logical type annotation. To 
> maintain backward compatibility, the semantics of converted_type shouldn't 
> change.
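
For illustration only, the read path described in the issue could be sketched 
roughly as below. This is not parquet-mr code: AnnotationResolver, its nested 
types and the resolve method are hypothetical stand-ins for the 
Thrift-generated classes, and the "prefer logicalType, otherwise fall back to 
converted_type" rule is an assumption about one way to keep old files 
readable. It does not settle the UUID or MAP_KEY_VALUE questions above.

```java
import java.util.Optional;

// Hypothetical stand-ins for the Thrift-generated types; not parquet-mr's real classes.
final class AnnotationResolver {

    /** Legacy annotation values (a small illustrative subset of converted_type). */
    enum ConvertedType { UTF8, MAP, MAP_KEY_VALUE, LIST, TIMESTAMP_MILLIS }

    /** New-style annotation, reduced to a name plus one parameter for illustration. */
    static final class LogicalType {
        final String name;
        final boolean isAdjustedToUTC;

        LogicalType(String name, boolean isAdjustedToUTC) {
            this.name = name;
            this.isAdjustedToUTC = isAdjustedToUTC;
        }
    }

    /** Minimal schema element carrying both optional fields; null means "not set". */
    static final class SchemaElement {
        final ConvertedType convertedType;
        final LogicalType logicalType;

        SchemaElement(ConvertedType convertedType, LogicalType logicalType) {
            this.convertedType = convertedType;
            this.logicalType = logicalType;
        }
    }

    /**
     * Assumed rule: use logicalType when the writer set it, otherwise fall back
     * to the legacy converted_type, whose meaning is left unchanged.
     */
    static Optional<String> resolve(SchemaElement element) {
        if (element.logicalType != null) {
            return Optional.of(element.logicalType.name);
        }
        if (element.convertedType != null) {
            return Optional.of(element.convertedType.name());
        }
        return Optional.empty(); // no annotation at all
    }
}
```

Under this assumed rule, a file written with only MAP_KEY_VALUE in 
converted_type would still resolve to its legacy name, which is exactly the 
case where a decision about the mapping is needed.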



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
