[ 
https://issues.apache.org/jira/browse/AVRO-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083646#comment-14083646
 ] 

Ryan Blue commented on AVRO-739:
--------------------------------

I agree we should standardize on a single epoch. I've been working lately on 
high-level types across a variety of storage formats and I think we need to 
keep the specifications as small as possible to ensure people can actually 
implement them. A spec doesn't help much if it ends up only partially 
implemented and we have to keep track of which parts each component 
supports.

I'm also in favor of simple names -- "date", "time" and so on. These names 
imply that they are the canonical way to store the type, which is exactly what 
we want for interoperability.

For specifics on what each type means, here is what we added to Parquet:
* *date* - an int, the number of days from the Unix epoch, 1 January 1970 (no 
time component)
* *time-millis* - an int, the number of milliseconds after midnight, 
00:00:00.000 (no date component)
* *timestamp-millis* - a long, the number of milliseconds from the Unix epoch, 
1 January 1970 00:00:00.000 UTC (combined date and time)
* *interval* - 12-byte fixed, a 3-tuple of independent durations in months, 
days, milliseconds
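
To make the four encodings above concrete, here is a minimal Python sketch of how a writer might produce each value. The function names are hypothetical and not part of either project. The interval layout (three little-endian unsigned 32-bit integers) is my reading of the Parquet spec and is an assumption here, since the list above only says "12-byte fixed".

```python
import struct
from datetime import date, datetime, time, timezone

UNIX_EPOCH_DATE = date(1970, 1, 1)
UNIX_EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)


def encode_date(d: date) -> int:
    """date: days from the Unix epoch, no time component."""
    return (d - UNIX_EPOCH_DATE).days


def encode_time_millis(t: time) -> int:
    """time-millis: milliseconds after midnight, no date component."""
    seconds = (t.hour * 60 + t.minute) * 60 + t.second
    return seconds * 1000 + t.microsecond // 1000


def encode_timestamp_millis(dt: datetime) -> int:
    """timestamp-millis: milliseconds from the Unix epoch in UTC.

    dt must be timezone-aware; a naive datetime raises TypeError here.
    """
    delta = dt - UNIX_EPOCH_UTC
    return (delta.days * 86_400 + delta.seconds) * 1000 + delta.microseconds // 1000


def encode_interval(months: int, days: int, millis: int) -> bytes:
    """interval: 12-byte fixed holding three independent durations.

    Assumption: each component is a little-endian unsigned 32-bit integer.
    """
    return struct.pack("<III", months, days, millis)


def decode_interval(raw: bytes) -> tuple:
    """Inverse of encode_interval, returning (months, days, millis)."""
    return struct.unpack("<III", raw)
```

For example, `encode_date(date(1970, 1, 2))` gives 1 and `encode_interval(1, 2, 3)` gives a 12-byte value. The three interval components are stored separately rather than collapsed into one number because a month has no fixed length in days, and a day has no fixed length in milliseconds across DST changes.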

There are more specifics in the [spec 
PR|https://github.com/apache/incubator-parquet-format/pull/5]. I would really 
like to see the Avro and Parquet communities adopt the same logical type 
encodings. That would be much easier for applications to implement, which means 
fewer bugs and better compatibility.

> Add Date/Time data types
> ------------------------
>
>                 Key: AVRO-739
>                 URL: https://issues.apache.org/jira/browse/AVRO-739
>             Project: Avro
>          Issue Type: New Feature
>          Components: spec
>            Reporter: Jeff Hammerbacher
>         Attachments: AVRO-739.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)
