[
https://issues.apache.org/jira/browse/BEAM-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ismaël Mejía updated BEAM-7999:
-------------------------------
Status: Open (was: Triage Needed)
> BigQueryIO.readTableRowsWithSchema() doesn't handle timestamp correctly
> -----------------------------------------------------------------------
>
> Key: BEAM-7999
> URL: https://issues.apache.org/jira/browse/BEAM-7999
> Project: Beam
> Issue Type: Bug
> Components: io-java-gcp
> Affects Versions: 2.14.0, 2.15.0
> Reporter: Alex Van Boxel
> Assignee: Alex Van Boxel
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Using the new readTableRowsWithSchema() to copy a table (a simple
> operation) fails while parsing the table's timestamp column, because the
> parser assumes a Double value. BigQuery outputs a string such as
> "2019-08-16 00:12:00.123456 UTC", which isn't handled.
> *Reproducible:*
> With this table:
> {code:sql}
> INSERT `research.alex.in1` (row_id, f_int64, f_timestamp)
> VALUES
> (1, 1, '2019-08-16 00:12:00 UTC'),
> (2, 2, '2019-08-16 00:12:00.123 UTC'),
> (3, 3, '2019-08-16 00:12:00.123456 UTC')
> {code}
> Do a copy operation:
> {code:java}
> pipeline
>     .apply(
>         BigQueryIO.readTableRowsWithSchema()
>             .from("research:alex.in1")
>             //.withMethod(BigQueryIO.TypedRead.Method.DIRECT_READ)
>         )
>     .apply(ParDo.of(new Inspect()))
>     .apply(
>         BigQueryIO.writeTableRows()
>             .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
>             .withMethod(BigQueryIO.Write.Method.FILE_LOADS)
>             .useBeamSchema()
>             .to("research:alex.out4"));
> {code}
>
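To illustrate the parsing gap described in the issue, here is a minimal sketch of a parser that accepts the string form BigQuery emits ("2019-08-16 00:12:00[.ffffff] UTC") and falls back to a numeric value otherwise. This is not Beam's actual code or its eventual fix: the class name `BigQueryTimestampSketch` and the method `parse` are hypothetical, and treating the numeric form as epoch seconds is an assumption based only on the report's note that the existing code expects a Double.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

// Hypothetical helper, for illustration only; not part of Beam.
final class BigQueryTimestampSketch {

  // Date and time, followed by an optional fraction of 0 to 6 digits.
  private static final DateTimeFormatter FORMAT =
      new DateTimeFormatterBuilder()
          .appendPattern("uuuu-MM-dd HH:mm:ss")
          .appendFraction(ChronoField.NANO_OF_SECOND, 0, 6, true)
          .toFormatter();

  private static final String UTC_SUFFIX = " UTC";

  static Instant parse(String value) {
    String v = value.trim();
    if (v.endsWith(UTC_SUFFIX)) {
      // String form: strip the zone suffix and interpret the rest as UTC.
      String local = v.substring(0, v.length() - UTC_SUFFIX.length());
      return LocalDateTime.parse(local, FORMAT).toInstant(ZoneOffset.UTC);
    }
    // Assumption: any other input is epoch seconds written as a decimal
    // number. BigDecimal avoids losing microseconds to double rounding.
    long micros =
        new BigDecimal(v)
            .movePointRight(6)
            .setScale(0, RoundingMode.HALF_UP)
            .longValueExact();
    return Instant.ofEpochSecond(
        Math.floorDiv(micros, 1_000_000L),
        Math.floorMod(micros, 1_000_000L) * 1_000L);
  }

  public static void main(String[] args) {
    // The three values inserted by the reproduction above.
    System.out.println(parse("2019-08-16 00:12:00 UTC"));
    System.out.println(parse("2019-08-16 00:12:00.123 UTC"));
    System.out.println(parse("2019-08-16 00:12:00.123456 UTC"));
  }
}
```

Checking for the " UTC" suffix first keeps the numeric path unchanged for callers that already pass a number, so the sketch only adds handling for the string case that the report says is missing.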