[ https://issues.apache.org/jira/browse/PARQUET-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197206#comment-16197206 ]
Ryan Blue commented on PARQUET-1125:
------------------------------------

I'm not sure I understand why we would want a more general 16-byte type. I think that INT96 was a similar idea, but that ended up being abused and never used to store big integers. Are you thinking about an INT128 type or something else? Hash digests?

Also, we can have more than one 16-byte logical type. I think a UUID type is a good idea so we have better storage for something we see a lot, and so object models can translate between String UUIDs and the storage representation transparently. If we wanted to do something similar for hash digests, then we would probably want a type with different expectations anyway.

> Add UUID logical type
> ---------------------
>
>                 Key: PARQUET-1125
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1125
>             Project: Parquet
>          Issue Type: Task
>          Components: parquet-format
>            Reporter: Ryan Blue
>
> I think we should add a UUID logical type that is stored in a 16-byte fixed.
> The common string representation is 36 bytes instead of the 16 required.
> UUIDs are commonly used as unique identifiers, so it makes sense to have a
> good support. A binary representation will reduce memory when writing or
> building bloom filters and will reduce cycles needed to compare values.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
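
For reference, the size difference mentioned in the issue description (36 bytes as a string versus 16 bytes in binary) can be illustrated with a minimal sketch. It is not taken from parquet-format or parquet-mr: the class name `UuidBytesSketch` and the helpers `toBytes` and `fromBytes` are hypothetical, and the big-endian byte order is an assumption, since the issue does not specify one.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical helper, not part of any Parquet API.
final class UuidBytesSketch {

    // Pack a UUID into 16 bytes: most significant 8 bytes first,
    // then least significant 8 bytes (big-endian order is an assumption).
    public static byte[] toBytes(UUID uuid) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putLong(uuid.getMostSignificantBits());
        buf.putLong(uuid.getLeastSignificantBits());
        return buf.array();
    }

    // Rebuild the UUID from the 16-byte form produced by toBytes.
    public static UUID fromBytes(byte[] bytes) {
        if (bytes.length != 16) {
            throw new IllegalArgumentException("expected 16 bytes, got " + bytes.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        long msb = buf.getLong();
        long lsb = buf.getLong();
        return new UUID(msb, lsb);
    }

    public static void main(String[] args) {
        UUID id = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");
        byte[] packed = toBytes(id);
        int textLen = id.toString().getBytes(StandardCharsets.US_ASCII).length;

        // Compare the canonical text form with the packed binary form.
        System.out.println(textLen + " bytes as text, " + packed.length + " bytes packed");
        // Confirm the binary form converts back to the same UUID.
        System.out.println(fromBytes(packed).equals(id));
    }
}
```

An object model could apply a conversion along these lines at the read and write boundary, so that callers keep working with String UUIDs while the file stores the 16-byte form.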