Sayat Satybaldiyev created BEAM-12375:
-----------------------------------------
Summary: ReadFromBigQuery doesn't support DATETIME type
Key: BEAM-12375
URL: https://issues.apache.org/jira/browse/BEAM-12375
Project: Beam
Issue Type: Bug
Components: io-go-gcp
Affects Versions: 2.29.0
Reporter: Sayat Satybaldiyev
According to the documentation [1], ReadFromBigQuery with Avro export should
produce native Python objects, e.g. datetime.datetime for DATETIME columns:
> the BigQuery types for DATE, DATETIME, TIME, and TIMESTAMP will be exported
> as strings. This behavior is consistent with BigQuerySource. When using Avro
> exports, these fields will be exported as native Python types (datetime.date,
> datetime.datetime, datetime.datetime, and datetime.datetime respectively)
However, in practice this doesn't happen for DATETIME, because Avro has no
logical type that maps to the BigQuery DATETIME type. The BigQuery
documentation [2] says:
> *Note:* There is no logical type that directly corresponds to {{DATETIME}},
> and BigQuery doesn't support any direct conversion from an Avro type into a
> {{DATETIME}} field.
I've also manually checked the Avro file that gets exported to GCS. The
logical type that BigQuery writes into the Avro schema is `datetime`:
{"name":"datetime_col","type":["null",{"type":"string","logicalType":"datetime"}]}
Neither the Avro spec [3] nor the fastavro library [4], which is used
underneath to read the Avro files, supports a `datetime` logical type, so the
column is read back as a plain string instead of datetime.datetime.
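Until this is handled in Beam itself, a possible workaround is to convert the
strings after the read. The following is only a minimal sketch: the helper
names (is_datetime_field, parse_bq_datetime, convert_row) are hypothetical and
not part of Beam or fastavro, and it assumes the exported DATETIME strings are
in a form that datetime.datetime.fromisoformat accepts, which I have not
verified for every precision.

```python
import datetime
import json

# Field schema copied from the exported Avro file described above.
FIELD_SCHEMA = json.loads(
    '{"name":"datetime_col","type":["null",'
    '{"type":"string","logicalType":"datetime"}]}'
)


def is_datetime_field(field):
    """Return True if an Avro field schema carries logicalType 'datetime'."""
    types = field["type"]
    if not isinstance(types, list):
        # A non-nullable field has a single type instead of a union.
        types = [types]
    return any(
        isinstance(t, dict) and t.get("logicalType") == "datetime"
        for t in types
    )


def parse_bq_datetime(value):
    """Convert an exported DATETIME string to datetime.datetime; keep None."""
    if value is None:
        return None
    # Assumption: the string is ISO 8601-like, e.g. 2021-05-19T12:34:56.123456
    return datetime.datetime.fromisoformat(value)


def convert_row(row, fields):
    """Return a copy of row with all 'datetime' logical-type columns parsed."""
    datetime_names = {f["name"] for f in fields if is_datetime_field(f)}
    return {
        key: parse_bq_datetime(val) if key in datetime_names else val
        for key, val in row.items()
    }
```

In a pipeline, something like convert_row could be applied with beam.Map right
after ReadFromBigQuery, passing the field schemas as an extra argument.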
Resources:
[1]
[https://beam.apache.org/releases/pydoc/2.29.0/apache_beam.io.gcp.bigquery.html#apache_beam.io.gcp.bigquery.ReadFromBigQuery]
[2]
[https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-avro#logical_types]
[3]
[https://avro.apache.org/docs/1.8.0/spec.html#Logical+Types]
[4] https://fastavro.readthedocs.io/en/latest/logical_types.html