+1 to more precision, even down to the nanosecond level, probably via
Reuven's proposal of a different internal representation.
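
For illustration, here is a minimal sketch of what such a representation
could look like. It uses only java.time so that it stays self-contained.
The class and method names (NanoDateTimeValue, getEpochMillis,
getEpochNanos) are hypothetical and are not Beam's actual Row API; the
millisecond accessor merely stands in for the existing Joda-returning
accessor, which would stay for compatibility.

```java
import java.time.Instant;

/** Hypothetical holder for a DATETIME value; not Beam's actual Row code. */
final class NanoDateTimeValue {
    private static final long NANOS_PER_SECOND = 1_000_000_000L;

    // Assumed internal representation: java.time.Instant, which keeps nanoseconds.
    private final Instant value;

    NanoDateTimeValue(Instant value) {
        this.value = value;
    }

    /** Builds a value from nanoseconds since the epoch (a plain number). */
    static NanoDateTimeValue ofEpochNanos(long epochNanos) {
        long seconds = Math.floorDiv(epochNanos, NANOS_PER_SECOND);
        long nanoAdjustment = Math.floorMod(epochNanos, NANOS_PER_SECOND);
        return new NanoDateTimeValue(Instant.ofEpochSecond(seconds, nanoAdjustment));
    }

    /** Millisecond view: the most a Joda-based accessor could expose (lossy). */
    long getEpochMillis() {
        return value.toEpochMilli();
    }

    /** Full-precision view; throws ArithmeticException if it does not fit in a long. */
    long getEpochNanos() {
        return Math.addExact(
            Math.multiplyExact(value.getEpochSecond(), NANOS_PER_SECOND),
            value.getNano());
    }

    public static void main(String[] args) {
        NanoDateTimeValue v = NanoDateTimeValue.ofEpochNanos(1_541_525_940_123_456_789L);
        System.out.println(v.getEpochMillis()); // sub-millisecond digits are dropped
        System.out.println(v.getEpochNanos());  // original value is preserved
    }
}
```

The point of the sketch is only that existing callers would keep getting
millisecond values, while new callers could ask for the full precision.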
On Tue, Nov 6, 2018 at 9:19 AM Robert Bradshaw <rober...@google.com> wrote:
>
> +1 to offering more granular timestamps in general. I think it will be
> odd if setting the element timestamp from a row DATETIME field is
> lossy, so we should seriously consider upgrading that as well.
> On Tue, Nov 6, 2018 at 6:42 AM Charles Chen <c...@google.com> wrote:
> >
> > One related issue that came up before is that we (perhaps unnecessarily) 
> > restrict the precision of timestamps in the Python SDK to milliseconds 
> > because of legacy reasons related to the Java runner's use of Joda time.  
> > Perhaps Beam portability should natively use a more granular timestamp unit.
> >
> > On Mon, Nov 5, 2018 at 9:34 PM Rui Wang <ruw...@google.com> wrote:
> >>
> >> Thanks Reuven!
> >>
> >> I think Reuven is proposing a third option:
> >>
> >> Change the internal representation of the DATETIME field in Row, but keep
> >> the public ReadableDateTime getDateTime(String fieldName) API so that it
> >> stays compatible with existing code. I think we could also add one more
> >> API, such as getDateTimeNanosecond. This option differs from option one
> >> because option one actually maintains two implementations of time.
> >>
> >> -Rui
> >>
> >> On Mon, Nov 5, 2018 at 9:26 PM Reuven Lax <re...@google.com> wrote:
> >>>
> >>> I would vote that we change the internal representation of Row to 
> >>> something other than Joda. Java 8 times would give us at least 
> >>> microseconds, and if we want nanoseconds we could simply store it as a 
> >>> number.
> >>>
> >>> We should still keep accessor methods that return and take Joda objects, 
> >>> as the rest of Beam still depends on Joda.
> >>>
> >>> Reuven
> >>>
> >>> On Mon, Nov 5, 2018 at 9:21 PM Rui Wang <ruw...@google.com> wrote:
> >>>>
> >>>> Hi Community,
> >>>>
> >>>> The DATETIME field in Beam Schema/Row is implemented with Joda's DateTime
> >>>> (see Row.java#L611 and Row.java#L169). Joda's DateTime is limited to
> >>>> millisecond precision. That is precise enough to represent event-time
> >>>> timestamps, but it is not enough for real "time" data.
> >>>> For "time"-typed data, we probably need to support up to nanosecond
> >>>> precision.
> >>>>
> >>>> Unfortunately, Joda has decided to stay at millisecond precision:
> >>>> https://github.com/JodaOrg/joda-time/issues/139.
> >>>>
> >>>> If we want to support nanosecond precision, we have two options:
> >>>>
> >>>> Option one: use the existing metadata field on FieldType, so that we
> >>>> could set a marker in the metadata and Row could check it to decide
> >>>> what is stored in a DATETIME field: Joda's DateTime or an
> >>>> implementation that supports nanoseconds.
> >>>>
> >>>> Option two: add another field (maybe called a TIMESTAMP field?) backed
> >>>> by an implementation that supports higher-precision time.
> >>>>
> >>>> What do you think about the need for higher precision in the time type,
> >>>> and which option do you prefer?
> >>>>
> >>>> -Rui
