Re: BigQuery TIMESTAMP and TimestampedValue()
Thanks Kenneth. Yes, fortunately it's *always* UTC, so I was able to solve it with

    t = event['ts']  # -[M]M-[D]D[( |T)[H]H:[M]M:[S]S[.DD]][UTC]
    dt = datetime.datetime.strptime(t, '%Y-%m-%d %H:%M:%S.%f %Z')
    # %Z only matches the literal "UTC" and leaves dt naive; attach UTC so
    # timestamp() does not interpret dt in the machine's local time zone.
    dt = dt.replace(tzinfo=datetime.timezone.utc)
    yield beam.window.TimestampedValue(event, dt.timestamp())

On Wed, Jan 22, 2020 at 7:07 PM Kenneth Knowles wrote:

> Ah, that's too bad. I wonder why they chose to put " UTC" on the end
> instead of just a "Z". Other than that, the format is RFC3339 and the
> iso8601 module does have the extension to use a space instead of a T to
> separate the date and time. I tested and if you strip the " UTC" then
> parsing succeeds.
>
> Since BigQuery TIMESTAMPS do not carry time zone information, it is safe
> to ignore the time zone portion. The problem of course is if they
> change/fix this it could break your code.
>
> Kenn
>
> On Mon, Jan 20, 2020 at 2:45 PM Sandy Walsh wrote:
>
>> :wave: Newb here for what will certainly be the first of many
>> silly questions ...
>>
>> I'm working on a dataflow pipeline using python SDK (local runners
>> currently).
>>
>> It's a bounded source from BigQuery. One column is a TIMESTAMP. I'm
>> trying to assign the timestamp using beam.window.TimestampedValue() but
>> the timestamp I'm getting back from BQ seems to be a string and not in
>> RFC3339 format.
>>
>> The format is '2019-12-13 09:38:19.380224 UTC' ... which I could
>> explicitly convert but I'd rather do that in the query.
>>
>> Any suggestions on how to get the timestamp back in format I can parse
>> with iso8601.parse_date() or, ideally, just pass into TimestampedValue()
>> without having to parse a string?
>>
>> Thanks
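
For anyone who wants to try the strptime approach from this reply outside a pipeline, here is a minimal standalone sketch using only the standard library. The helper name `bq_timestamp_to_epoch` and the constant `BQ_TIMESTAMP_FORMAT` are made up for illustration, and the sketch assumes every value ends in " UTC" and includes fractional seconds, as in the example from the thread.

```python
from datetime import datetime, timezone

# Format of the example in the thread: '2019-12-13 09:38:19.380224 UTC'.
BQ_TIMESTAMP_FORMAT = '%Y-%m-%d %H:%M:%S.%f %Z'


def bq_timestamp_to_epoch(ts: str) -> float:
    """Convert a BigQuery-style UTC timestamp string to Unix epoch seconds.

    Hypothetical helper, not part of Beam or any BigQuery client.
    """
    naive = datetime.strptime(ts, BQ_TIMESTAMP_FORMAT)
    # %Z accepts the "UTC" suffix but does not set tzinfo, so attach UTC
    # explicitly; a naive datetime would be read as local time by timestamp().
    aware = naive.replace(tzinfo=timezone.utc)
    return aware.timestamp()
```

Inside a DoFn the result could then be passed as the second argument to beam.window.TimestampedValue(), in place of dt.timestamp().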
Re: BigQuery TIMESTAMP and TimestampedValue()
Ah, that's too bad. I wonder why they chose to put " UTC" on the end instead of just a "Z". Other than that, the format is RFC3339 and the iso8601 module does have the extension to use a space instead of a T to separate the date and time. I tested and if you strip the " UTC" then parsing succeeds.

Since BigQuery TIMESTAMPS do not carry time zone information, it is safe to ignore the time zone portion. The problem of course is if they change/fix this it could break your code.

Kenn

On Mon, Jan 20, 2020 at 2:45 PM Sandy Walsh wrote:

> :wave: Newb here for what will certainly be the first of many
> silly questions ...
>
> I'm working on a dataflow pipeline using python SDK (local runners
> currently).
>
> It's a bounded source from BigQuery. One column is a TIMESTAMP. I'm trying
> to assign the timestamp using beam.window.TimestampedValue() but the
> timestamp I'm getting back from BQ seems to be a string and not in RFC3339
> format.
>
> The format is '2019-12-13 09:38:19.380224 UTC' ... which I could
> explicitly convert but I'd rather do that in the query.
>
> Any suggestions on how to get the timestamp back in format I can parse
> with iso8601.parse_date() or, ideally, just pass into TimestampedValue()
> without having to parse a string?
>
> Thanks
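
The strip-the-suffix idea from this reply can be sketched roughly as follows. To keep the example free of third-party dependencies it swaps the iso8601 module for the standard library's datetime.fromisoformat() (Python 3.7+), which also accepts a space between date and time. The helper name `parse_bq_timestamp` is hypothetical. Checking for the suffix explicitly means that a future change in the output format raises an error instead of silently producing wrong timestamps.

```python
from datetime import datetime, timezone

UTC_SUFFIX = ' UTC'


def parse_bq_timestamp(ts: str) -> datetime:
    """Parse '2019-12-13 09:38:19.380224 UTC' into an aware UTC datetime.

    Hypothetical helper; raises ValueError when the trailing ' UTC' is
    missing so that a format change fails loudly.
    """
    if not ts.endswith(UTC_SUFFIX):
        raise ValueError(f'expected a trailing {UTC_SUFFIX!r}: {ts!r}')
    # Drop the suffix, then parse the remaining ISO-like date and time.
    naive = datetime.fromisoformat(ts[:-len(UTC_SUFFIX)])
    # The value is known to be UTC, so attach that zone explicitly.
    return naive.replace(tzinfo=timezone.utc)
```

Calling .timestamp() on the returned datetime gives epoch seconds suitable for TimestampedValue().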
BigQuery TIMESTAMP and TimestampedValue()
:wave: Newb here for what will certainly be the first of many silly questions ...

I'm working on a Dataflow pipeline using the Python SDK (local runners currently).

It's a bounded source from BigQuery. One column is a TIMESTAMP. I'm trying to assign the timestamp using beam.window.TimestampedValue() but the timestamp I'm getting back from BQ seems to be a string and not in RFC3339 format.

The format is '2019-12-13 09:38:19.380224 UTC' ... which I could explicitly convert but I'd rather do that in the query.

Any suggestions on how to get the timestamp back in a format I can parse with iso8601.parse_date() or, ideally, just pass into TimestampedValue() without having to parse a string?

Thanks
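
One possible way to do the conversion in the query, offered only as a sketch: BigQuery standard SQL has a UNIX_MICROS() function that turns a TIMESTAMP into an integer count of microseconds since the epoch, so no string parsing is needed on the Python side. The table name, the `ts_micros` alias, and the helper `micros_to_seconds` below are hypothetical; the column name `ts` is taken from the thread, and the query is assumed to run as standard SQL rather than legacy SQL.

```python
# Hypothetical table name and alias; UNIX_MICROS is a BigQuery standard SQL
# built-in that returns microseconds since 1970-01-01 00:00:00 UTC as INT64.
QUERY = """
SELECT
  *,
  UNIX_MICROS(ts) AS ts_micros
FROM `my_project.my_dataset.events`
"""


def micros_to_seconds(ts_micros: int) -> float:
    """Convert integer epoch microseconds to float epoch seconds."""
    return ts_micros / 1_000_000
```

Each row's `ts_micros` value, passed through micros_to_seconds(), could then go straight into beam.window.TimestampedValue() as the timestamp.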