Re: BigQuery TIMESTAMP and TimestampedValue()

2020-01-23 Thread Sandy Walsh
Thanks Kenneth.

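Yes, fortunately it's *always* UTC, so I was able to solve it with:

import datetime

t = event['ts']
# Format: YYYY-[M]M-[D]D[( |T)[H]H:[M]M:[S]S[.DDDDDD]][UTC]
dt = datetime.datetime.strptime(t, '%Y-%m-%d %H:%M:%S.%f %Z')
# strptime() leaves dt naive even with %Z, so attach UTC before timestamp().
dt = dt.replace(tzinfo=datetime.timezone.utc)
yield beam.window.TimestampedValue(event, dt.timestamp())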
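
For anyone who wants to try the conversion outside a pipeline, here is a minimal standalone sketch. It assumes the string always ends in ' UTC' and always includes fractional seconds, as in the example from this thread. The name parse_bq_timestamp is a hypothetical helper, not part of Beam or BigQuery. Note that strptime() returns a naive datetime even when %Z matches, so the sketch attaches UTC explicitly; otherwise timestamp() would interpret the value in the machine's local time zone.

```python
import datetime


def parse_bq_timestamp(text):
    """Convert 'YYYY-MM-DD HH:MM:SS.ffffff UTC' to seconds since the epoch."""
    # %Z accepts the literal 'UTC' but leaves tzinfo unset (naive datetime).
    naive = datetime.datetime.strptime(text, '%Y-%m-%d %H:%M:%S.%f %Z')
    # Attach UTC so timestamp() does not fall back to the local time zone.
    aware = naive.replace(tzinfo=datetime.timezone.utc)
    return aware.timestamp()
```

Inside a DoFn the result would be passed straight through, e.g. beam.window.TimestampedValue(event, parse_bq_timestamp(event['ts'])).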


On Wed, Jan 22, 2020 at 7:07 PM Kenneth Knowles wrote:

> Ah, that's too bad. I wonder why they chose to put " UTC" on the end
> instead of just a "Z". Other than that, the format is RFC3339 and the
> iso8601 module does have the extension to use a space instead of a T to
> separate the date and time. I tested and if you strip the " UTC" then
> parsing succeeds.
>
> Since BigQuery TIMESTAMPS do not carry time zone information, it is safe
> to ignore the time zone portion. The problem of course is if they
> change/fix this it could break your code.
>
> Kenn
>
> On Mon, Jan 20, 2020 at 2:45 PM Sandy Walsh wrote:
>
>> Newb here for what will certainly be the first of many
>> silly questions ...
>>
>> I'm working on a Dataflow pipeline using the Python SDK (local runners
>> currently).
>>
>> It's a bounded source from BigQuery. One column is a TIMESTAMP. I'm
>> trying to assign the timestamp using beam.window.TimestampedValue() but
>> the timestamp I'm getting back from BQ seems to be a string and not in
>> RFC3339 format.
>>
>> The format is '2019-12-13 09:38:19.380224 UTC' ... which I could
>> explicitly convert but I'd rather do that in the query.
>>
>> Any suggestions on how to get the timestamp back in a format I can parse
>> with iso8601.parse_date() or, ideally, just pass into TimestampedValue()
>> without having to parse a string?
>>
>> Thanks
>>
>>


Re: BigQuery TIMESTAMP and TimestampedValue()

2020-01-22 Thread Kenneth Knowles
Ah, that's too bad. I wonder why they chose to put " UTC" on the end
instead of just a "Z". Other than that, the format is RFC3339 and the
iso8601 module does have the extension to use a space instead of a T to
separate the date and time. I tested and if you strip the " UTC" then
parsing succeeds.

Since BigQuery TIMESTAMPS do not carry time zone information, it is safe to
ignore the time zone portion. The problem of course is if they change/fix
this it could break your code.
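
One way to guard against that kind of format change is to strip whichever UTC marker happens to be present before parsing. The sketch below is only an illustration: it swaps in the standard library's datetime.fromisoformat() (Python 3.7+) for the iso8601 module mentioned above, the name parse_utc_string is hypothetical, and it assumes the value is always UTC and has either no fractional part or exactly 3 or 6 fractional digits.

```python
import datetime


def parse_utc_string(text):
    """Parse a BigQuery-style timestamp string into an aware UTC datetime."""
    cleaned = text.strip()
    # Drop whichever UTC marker is present: the ' UTC' suffix seen in this
    # thread, or a trailing 'Z' in case the output ever becomes RFC 3339.
    for suffix in (' UTC', 'Z'):
        if cleaned.endswith(suffix):
            cleaned = cleaned[:-len(suffix)]
            break
    # fromisoformat() accepts either a space or a 'T' between date and time.
    naive = datetime.datetime.fromisoformat(cleaned)
    # The value is assumed to be UTC, so attach that zone explicitly.
    return naive.replace(tzinfo=datetime.timezone.utc)
```

Calling .timestamp() on the returned datetime gives the seconds-since-epoch value that TimestampedValue() expects.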

Kenn

On Mon, Jan 20, 2020 at 2:45 PM Sandy Walsh wrote:

> Newb here for what will certainly be the first of many
> silly questions ...
>
> I'm working on a Dataflow pipeline using the Python SDK (local runners
> currently).
>
> It's a bounded source from BigQuery. One column is a TIMESTAMP. I'm trying
> to assign the timestamp using beam.window.TimestampedValue() but the
> timestamp I'm getting back from BQ seems to be a string and not in RFC3339
> format.
>
> The format is '2019-12-13 09:38:19.380224 UTC' ... which I could
> explicitly convert but I'd rather do that in the query.
>
> Any suggestions on how to get the timestamp back in a format I can parse
> with iso8601.parse_date() or, ideally, just pass into TimestampedValue()
> without having to parse a string?
>
> Thanks
>
>


BigQuery TIMESTAMP and TimestampedValue()

2020-01-20 Thread Sandy Walsh
Newb here for what will certainly be the first of many
silly questions ...

I'm working on a Dataflow pipeline using the Python SDK (local runners
currently).

It's a bounded source from BigQuery. One column is a TIMESTAMP. I'm trying
to assign the timestamp using beam.window.TimestampedValue() but the
timestamp I'm getting back from BQ seems to be a string and not in RFC3339
format.

The format is '2019-12-13 09:38:19.380224 UTC' ... which I could explicitly
convert but I'd rather do that in the query.

Any suggestions on how to get the timestamp back in a format I can parse with
iso8601.parse_date() or, ideally, just pass into TimestampedValue() without
having to parse a string?

Thanks
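
One possible way to do the conversion in the query itself, sketched under some assumptions: the query runs as BigQuery standard SQL (where UNIX_MICROS() is available), and the project, dataset, table, and column names below are hypothetical placeholders. The query returns the timestamp as an integer count of microseconds since the epoch, so the pipeline only has to divide rather than parse a string.

```python
# Hypothetical project, dataset, table, and column names.
QUERY = """
SELECT
  *,
  UNIX_MICROS(ts) AS ts_micros
FROM `my_project.my_dataset.events`
"""


def micros_to_seconds(row):
    """Return the event time in seconds since the epoch from a result row."""
    # UNIX_MICROS yields an INT64, so no string parsing is needed here.
    return row['ts_micros'] / 1_000_000
```

The returned value can then be handed to beam.window.TimestampedValue(row, micros_to_seconds(row)).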