Hi Community,

The DATETIME field in Beam Schema/Row is implemented by Joda's DateTime
(see Row.java#L611
<https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/values/Row.java#L611>
 and Row.java#L169
<https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/values/Row.java#L169>).
Joda's DateTime is limited to millisecond precision. That is precise
enough to represent an event-time timestamp, but not enough for real
"time" data. For "time" type data, we probably need to support precision
down to the nanosecond.
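To make the precision loss concrete, here is a minimal sketch that uses
only java.time (not Joda and not Beam). It round-trips a nanosecond
timestamp through epoch milliseconds, which is the finest resolution a
millisecond-based representation can keep. The class and method names are
made up for illustration.

```java
import java.time.Instant;

class PrecisionDemo {
    // Round-trips an instant through epoch milliseconds, dropping
    // everything below the millisecond.
    static Instant viaMillis(Instant t) {
        return Instant.ofEpochMilli(t.toEpochMilli());
    }

    public static void main(String[] args) {
        // 2019-01-01T00:00:00Z plus 123,456,789 nanoseconds.
        Instant original = Instant.ofEpochSecond(1_546_300_800L, 123_456_789L);
        Instant roundTripped = viaMillis(original);

        System.out.println(original);      // 2019-01-01T00:00:00.123456789Z
        System.out.println(roundTripped);  // 2019-01-01T00:00:00.123Z
        // Nanoseconds lost by the round trip: 456789
        System.out.println(original.getNano() - roundTripped.getNano());
    }
}
```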

Unfortunately, Joda has decided to stay at millisecond precision:
https://github.com/JodaOrg/joda-time/issues/139.

If we want to support nanosecond precision, there are two options:

Option one: use the current FieldType's metadata field
<https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Schema.java#L421>,
such that we could set something in the metadata and Row could check the
metadata to decide what is stored in the DATETIME field: Joda's DateTime
or an implementation that supports nanoseconds.
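A rough sketch of what option one could look like, written against plain
Java rather than Beam's actual classes. The metadata map, the "precision"
key, and the "nanos" value are all hypothetical stand-ins; nothing here is
an existing Beam API. The idea is only that the reader checks the metadata
and falls back to today's millisecond interpretation when nothing is set.

```java
import java.time.Instant;
import java.util.Map;

class MetadataDispatchSketch {
    // Hypothetical metadata key; not defined anywhere in Beam.
    static final String PRECISION_KEY = "precision";
    static final long NANOS_PER_SECOND = 1_000_000_000L;

    // Interprets a raw long according to the field's metadata:
    // "nanos" means epoch nanoseconds, anything else means epoch
    // milliseconds (the existing behaviour).
    static Instant read(long raw, Map<String, String> metadata) {
        if ("nanos".equals(metadata.get(PRECISION_KEY))) {
            // floorDiv/floorMod keep the nanosecond part non-negative
            // for instants before the epoch.
            return Instant.ofEpochSecond(
                Math.floorDiv(raw, NANOS_PER_SECOND),
                Math.floorMod(raw, NANOS_PER_SECOND));
        }
        return Instant.ofEpochMilli(raw);
    }
}
```

One thing the sketch glosses over: a single long of epoch nanoseconds only
covers roughly 292 years on either side of 1970, so a real implementation
might prefer a seconds-plus-nanos pair.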

Option two: add another field (maybe called a TIMESTAMP field?) backed by
an implementation that supports higher time precision.
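For option two, one possible backing representation is java.time.Instant,
which stores epoch seconds plus a nanosecond-of-second. The sketch below
shows how such a value might be encoded into 12 bytes and decoded again
without losing precision. The class name and the byte layout are
assumptions for illustration, not a proposal for the actual coder.

```java
import java.nio.ByteBuffer;
import java.time.Instant;

class NanosTimestampSketch {
    // Encodes an instant as 8 bytes of epoch seconds followed by
    // 4 bytes of nanosecond-of-second (big-endian, ByteBuffer default).
    static byte[] encode(Instant t) {
        return ByteBuffer.allocate(Long.BYTES + Integer.BYTES)
            .putLong(t.getEpochSecond())
            .putInt(t.getNano())
            .array();
    }

    // Reverses encode(): reads the seconds and nanos back in order.
    static Instant decode(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        long seconds = buf.getLong();
        int nanos = buf.getInt();
        return Instant.ofEpochSecond(seconds, nanos);
    }
}
```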

What do you think about the need for higher precision for the time type,
and which option do you prefer?

-Rui
