[ 
https://issues.apache.org/jira/browse/PARQUET-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197206#comment-16197206
 ] 

Ryan Blue commented on PARQUET-1125:
------------------------------------

I'm not sure I understand why we would want a more general 16-byte type. I 
think that INT96 was a similar idea, but it ended up being abused and was 
never used to store big integers. Are you thinking about an INT128 type or 
something else? Hash digests?

Also, we can have more than one 16-byte logical type. I think a UUID type is a 
good idea so we have better storage for something we see a lot, and so object 
models can translate between String UUIDs and the storage representation 
transparently. If we wanted to do something similar for hash digests, then we 
would probably want a type with different expectations anyway.
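
A minimal sketch of the String-to-storage translation described above, 
assuming the 16 bytes hold the UUID in big-endian order (the issue does not 
specify a byte layout). The function names are hypothetical and not part of 
any Parquet API:

```python
import uuid


def uuid_to_fixed16(text: str) -> bytes:
    """Parse the canonical 36-character form into 16 bytes (big-endian, an assumption)."""
    return uuid.UUID(text).bytes


def fixed16_to_uuid(raw: bytes) -> str:
    """Rebuild the canonical lowercase string form from a 16-byte value."""
    if len(raw) != 16:
        raise ValueError(f"expected 16 bytes, got {len(raw)}")
    return str(uuid.UUID(bytes=raw))


text = "123e4567-e89b-12d3-a456-426614174000"
raw = uuid_to_fixed16(text)

# The string form needs 36 characters; the binary form needs 16 bytes.
print(len(text), len(raw))

# The round trip restores the original string.
print(fixed16_to_uuid(raw) == text)
```

An object model could apply a conversion like this at the read and write 
boundary, so callers keep working with String UUIDs while the file stores 
the compact form.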

> Add UUID logical type
> ---------------------
>
>                 Key: PARQUET-1125
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1125
>             Project: Parquet
>          Issue Type: Task
>          Components: parquet-format
>            Reporter: Ryan Blue
>
> I think we should add a UUID logical type that is stored in a 16-byte 
> fixed-length byte array. The common string representation takes 36 bytes, 
> while only 16 are required. UUIDs are commonly used as unique identifiers, 
> so it makes sense to have good support for them. A binary representation 
> will reduce memory use when writing data or building bloom filters, and 
> will reduce the cycles needed to compare values.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
