I would vote that we change the internal representation of Row to something
other than Joda. The Java 8 java.time types would give us at least
microseconds (java.time.Instant is specified to nanosecond resolution), and
if we want nanoseconds we could also simply store it as a number.

We should still keep accessor methods that return and take Joda objects, as
the rest of Beam still depends on Joda.
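
As a rough sketch of what I mean (not Beam's actual Row code; the class
name NanoTimestamp and its methods are made up for illustration), the
value could be stored as epoch seconds plus nanoseconds, with a
millisecond accessor that Joda-based callers can keep using:

```java
import java.time.Instant;

/** Hypothetical holder for a DATETIME value, stored as plain numbers. */
final class NanoTimestamp {
  private final long epochSeconds; // whole seconds since 1970-01-01T00:00:00Z
  private final int nanoOfSecond;  // 0..999,999,999

  private NanoTimestamp(long epochSeconds, int nanoOfSecond) {
    this.epochSeconds = epochSeconds;
    this.nanoOfSecond = nanoOfSecond;
  }

  /** Keeps the full nanosecond precision of a java.time.Instant. */
  static NanoTimestamp ofJavaInstant(Instant instant) {
    return new NanoTimestamp(instant.getEpochSecond(), instant.getNano());
  }

  /** Entry point for millisecond sources, e.g. a Joda getMillis() value. */
  static NanoTimestamp ofEpochMillis(long millis) {
    return ofJavaInstant(Instant.ofEpochMilli(millis));
  }

  Instant toJavaInstant() {
    return Instant.ofEpochSecond(epochSeconds, nanoOfSecond);
  }

  /**
   * Millisecond view; sub-millisecond precision is lost here. A Joda
   * accessor could wrap this, e.g. new org.joda.time.DateTime(millis).
   */
  long toEpochMillis() {
    return toJavaInstant().toEpochMilli();
  }
}
```

The Joda-facing accessors would then be thin wrappers around
ofEpochMillis() and toEpochMillis(), so existing callers see the same
millisecond values they see today.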

Reuven

On Mon, Nov 5, 2018 at 9:21 PM Rui Wang <ruw...@google.com> wrote:

> Hi Community,
>
> The DATETIME field in Beam Schema/Row is implemented by Joda's DateTime
> (see Row.java#L611
> <https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/values/Row.java#L611>
>  and Row.java#L169
> <https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/values/Row.java#L169>).
> Joda's DateTime is limited to millisecond precision. That is precise
> enough to represent event-time timestamps, but it is not enough
> for real "time" data. For "time"-typed data, we probably need to
> support precision up to nanoseconds.
>
> Unfortunately, Joda decided to stay at millisecond precision:
> https://github.com/JodaOrg/joda-time/issues/139.
>
> If we want to support nanosecond precision, we have two
> options:
>
> Option one: use the current FieldType's metadata field
> <https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Schema.java#L421>,
> so that we can store a marker in the metadata and Row can check the
> metadata to decide what is saved in the DATETIME field: Joda's DateTime or an
> implementation that supports nanoseconds.
>
> Option two: add another field type (maybe called TIMESTAMP?) with an
> implementation that supports higher time precision.
>
> What do you think about the need for higher precision for time types, and
> which option do you prefer?
>
> -Rui
>
