Hi,

I also vote for the 3rd option (two new logical types:
‘local-timestamp-millis’ and ‘local-timestamp-micros’).
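
For anyone skimming the thread, here is a rough sketch of how I understand the distinction. It is not taken from the POCs. It assumes the new types annotate a long the same way 'timestamp-millis' does, and the helper name `decode` and the sample field schemas are made up for illustration. An unrecognized logical type falls back to the underlying long, as Ryan describes below.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical field schemas; the 'local-*' names are the ones proposed in
# this thread, not part of any released Avro specification at this point.
UTC_TS_FIELD = {"type": "long", "logicalType": "timestamp-millis"}
LOCAL_TS_FIELD = {"type": "long", "logicalType": "local-timestamp-millis"}

EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)
EPOCH_LOCAL = datetime(1970, 1, 1)  # naive: no time zone attached


def decode(schema, value):
    """Interpret a long according to its logical type (illustrative only)."""
    logical = schema.get("logicalType")
    if logical == "timestamp-millis":
        # UTC-normalized: an instant on the global timeline.
        return EPOCH_UTC + timedelta(milliseconds=value)
    if logical == "local-timestamp-millis":
        # Local semantics: a wall-clock reading, independent of time zone.
        return EPOCH_LOCAL + timedelta(milliseconds=value)
    if logical == "local-timestamp-micros":
        return EPOCH_LOCAL + timedelta(microseconds=value)
    # Unknown logical type: fall back to the underlying Avro type.
    return value


instant = decode(UTC_TS_FIELD, 1556027340000)
wall_clock = decode(LOCAL_TS_FIELD, 1556027340000)
raw = decode({"type": "long", "logicalType": "not-yet-known"}, 1556027340000)
```

The same long yields a time-zone-aware value for the existing type and a naive one for the proposed type, while a reader that does not know the annotation still gets the plain long.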

Could you please create a JIRA for this task and send a link to it to
this e-mail thread for everyone interested in the topic?

Thanks,

Zoltan

On Tue, Apr 23, 2019 at 1:49 PM Ryan Skraba <[email protected]> wrote:
>
> Hello!  I read the document with interest.  Very well-written and clean --
> I feel better equipped to explain the importance of the different flavours
> of date/time after reading it.
>
> I didn't go through the POC code in detail, but I did go through a bunch of
> our code to check how the proposed implementations would affect us (to
> provide a single, anecdotal data point).  We currently use Avro to
> represent hierarchical data internally as it passes through a transformation
> pipeline running on a cluster.  We mostly rely on generic data.  The input
> or output might already be in Avro (file or binary message format), but it
> isn't necessary.  We do the schema inference and conversion on non-Avro
> when required.
>
> For us, it looks like both option#2 and option#3 should be more-or-less
> safe.  If we don't recognize a logical type, we'll just fall back on the
> underlying Avro type, and even propagate the unknown logical type down the
> pipeline if we can.
>
> Specifically, the bold proposal (option#2) for a new, unified logical type
> would mostly work without code modification on our part.  There's one or
> two places where we'd lose some helpful features where the semantic
> date/time type is taken into account, until we did the necessary rewrites.
> It wouldn't be a difficult task for us to bump to an Avro version that uses
> the new, unified logical type.
>
> Of course, the problem occurs when we're writing out data in Avro ... and
> the user has a next stage that doesn't understand the change.  Even if I
> appreciate the elegance of having a unified date/time logical type, it
> really seems like the more conservative third option (multiplying the
> number of logical types) is preferable.  Even if Avro ends up with a dozen
> logical types to describe the different flavours of date/time, this can
> eventually be unified in the language-specific API tools without breaking
> the schema specification.
>
> TL;DR: I read it, I appreciated it, I agree with your conclusions.
>
> Thanks again for the thorough and articulate work!  Ryan
>
>
>
> On Wed, Apr 17, 2019 at 9:44 AM Nandor Kollar <[email protected]>
> wrote:
>
> > Hi all,
> >
> > There is an ongoing effort to harmonize timestamp types for various popular
> > SQL engines for Hadoop (see details here
> > <
> > https://docs.google.com/document/d/1E-7miCh4qK6Mg54b-Dh5VOyhGX8V4xdMXKIHJL36a9U/edit#
> > >).
> > As part of this effort, on-disk file formats should be able to support all
> > of these semantics. Avro timestamp logical type supports only one semantic:
> > UTC normalized. I put together a simple design doc and two POCs which
> > introduce additional local date/time semantics into Avro. Here is the
> > design doc:
> >
> > https://docs.google.com/document/d/1rLmb4-6G8LHBwHUU2P_8gE1o3lvMV0gSitnmiXXmlWY/edit?usp=sharing
> >
> > What are the thoughts on this? Please have a look at the POCs, and feel
> > free to comment on the design doc!
> >
> > Thanks,
> > Nandor
> >
