Sayat Satybaldiyev created BEAM-12375:
-----------------------------------------

             Summary: ReadFromBigQuery doesn't support DATETIME type
                 Key: BEAM-12375
                 URL: https://issues.apache.org/jira/browse/BEAM-12375
             Project: Beam
          Issue Type: Bug
          Components: io-go-gcp
    Affects Versions: 2.29.0
            Reporter: Sayat Satybaldiyev


According to the documentation [1], ReadFromBigQuery with Avro export should 
produce a native Python object, i.e. datetime.datetime, for a DATETIME column.

> the BigQuery types for DATE, DATETIME, TIME, and TIMESTAMP will be exported 
>as strings. This behavior is consistent with BigQuerySource. When using Avro 
>exports, these fields will be exported as native Python types (datetime.date, 
>datetime.datetime, datetime.datetime, and datetime.datetime respectively)

 

However, in practice this doesn't happen, because there is no Avro logical 
type that maps to the BigQuery DATETIME type [2].

 

> *Note:* There is no logical type that directly corresponds to {{DATETIME}}, 
>and BigQuery doesn't support any direct conversion from an Avro type into a 
>{{DATETIME}} field.

 

I've also manually checked the Avro file that gets exported to GCS. The 
logical type that BigQuery writes in the Avro schema for this column is 
`datetime`.

{"name":"datetime_col","type":["null",{"type":"string","logicalType":"datetime"}]}
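
For illustration, here is a minimal sketch that checks a schema field like the one above for logical types the Avro spec does not define. The helper name `unknown_logical_types` and the constant are hypothetical, and the set of names is my reading of the Avro 1.8.0 spec [3], so it should be verified there. The spec says readers must ignore unknown logical types and fall back to the underlying type, which here is `string`.

```python
import json

# Logical type names as I read them in the Avro 1.8.0 spec (assumption: verify against [3]).
AVRO_1_8_LOGICAL_TYPES = {
    "decimal", "date", "time-millis", "time-micros",
    "timestamp-millis", "timestamp-micros", "duration",
}


def unknown_logical_types(field):
    """Return (logicalType, underlying type) pairs the spec does not define."""
    field_type = field["type"]
    # A nullable column is exported as a union, i.e. a JSON list of branches.
    branches = field_type if isinstance(field_type, list) else [field_type]
    found = []
    for branch in branches:
        if isinstance(branch, dict) and "logicalType" in branch:
            if branch["logicalType"] not in AVRO_1_8_LOGICAL_TYPES:
                found.append((branch["logicalType"], branch["type"]))
    return found


# The field definition copied from the exported file's schema.
field = json.loads(
    '{"name":"datetime_col","type":["null",'
    '{"type":"string","logicalType":"datetime"}]}'
)
print(unknown_logical_types(field))  # [('datetime', 'string')]
```

A reader that follows the spec would therefore hand back the column as a plain string.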

 

Neither the Avro spec [3] nor the fastavro library [4], which is used 
underneath to read the Avro files, supports a `datetime` logical type.
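
Until this is fixed, a possible workaround is to parse the string after the read step. The sketch below is not part of Beam; `parse_datetime_fields` is a hypothetical helper, the sample row is made up, and it assumes the exported string is in an ISO 8601 form that `datetime.fromisoformat` accepts, which I have not verified for every BigQuery value.

```python
from datetime import datetime


def parse_datetime_fields(row, fields):
    """Return a copy of row with the named string fields parsed to datetime."""
    out = dict(row)
    for name in fields:
        value = out.get(name)
        # NULL columns arrive as None and are left untouched.
        if isinstance(value, str):
            # Assumption: the value is ISO 8601, e.g. "2021-05-19T10:30:00".
            out[name] = datetime.fromisoformat(value)
    return out


# Hypothetical row, shaped like the dicts ReadFromBigQuery emits.
row = {"id": 1, "datetime_col": "2021-05-19T10:30:00", "nullable_col": None}
parsed = parse_datetime_fields(row, ["datetime_col", "nullable_col"])
print(parsed["datetime_col"])  # 2021-05-19 10:30:00
```

In a pipeline this could be applied right after the read with something like `beam.Map(parse_datetime_fields, fields=["datetime_col"])`.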

 

Resources:

[1] 
[https://beam.apache.org/releases/pydoc/2.29.0/apache_beam.io.gcp.bigquery.html#apache_beam.io.gcp.bigquery.ReadFromBigQuery]

[2] 
[https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-avro#logical_types]

[3] 
[https://avro.apache.org/docs/1.8.0/spec.html#Logical+Types]

[4] https://fastavro.readthedocs.io/en/latest/logical_types.html

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
