[
https://issues.apache.org/jira/browse/BEAM-13990?focusedWorklogId=731364&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-731364
]
ASF GitHub Bot logged work on BEAM-13990:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 23/Feb/22 05:46
Start Date: 23/Feb/22 05:46
Worklog Time Spent: 10m
Work Description: liu-du commented on pull request #16926:
URL: https://github.com/apache/beam/pull/16926#issuecomment-1048466314
> BigQuery should be accepting string values here - formatting as an integer
should be a performance change only
I think it's a _correctness_ issue rather than a performance change. The BigQuery
Storage Write API _rejects_ date/timestamp values encoded as strings. In this
commit
[b56823d1d213adf6ca5564ce1d244cc4ae8f0816](https://github.com/liu-du/beam/commit/b56823d1d213adf6ca5564ce1d244cc4ae8f0816),
I used the existing conversion (string values for date/timestamp columns) and
added an integration test that writes date/timestamp as protobuf string values;
the error I get is:
`INVALID_ARGUMENT: The proto field mismatched with BigQuery field at
Da40f0078_2574_4708_b66f_79eb0e516c43.datevalue, the proto field type string,
BigQuery field type DATE Entity`
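
For reference, a minimal sketch (assuming `java.time`; the input values are illustrative) of the integer encodings the conversion table calls for instead:

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

public class StorageWriteEncodings {
  public static void main(String[] args) {
    // DATE -> int32: days since the Unix epoch (1970-01-01).
    int dateValue = (int) LocalDate.parse("2022-02-23").toEpochDay(); // 19046

    // TIMESTAMP -> int64: microseconds since the Unix epoch.
    long timestampValue =
        ChronoUnit.MICROS.between(Instant.EPOCH, Instant.parse("2022-02-23T05:46:00Z"));

    System.out.println(dateValue + " " + timestampValue);
  }
}
```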
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 731364)
Remaining Estimate: 119h 10m (was: 119h 20m)
Time Spent: 50m (was: 40m)
> BigQueryIO cannot write to DATE and TIMESTAMP columns when using Storage
> Write API
> -----------------------------------------------------------------------------------
>
> Key: BEAM-13990
> URL: https://issues.apache.org/jira/browse/BEAM-13990
> Project: Beam
> Issue Type: Bug
> Components: io-java-gcp
> Affects Versions: 2.36.0
> Reporter: Du Liu
> Priority: P1
> Original Estimate: 120h
> Time Spent: 50m
> Remaining Estimate: 119h 10m
>
> When using the Storage Write API with BigQueryIO, DATE and TIMESTAMP values are
> currently converted to String type in the protobuf message. This is incorrect:
> according to the Storage Write API [documentation|#data_type_conversions], DATE
> should be converted to int32 and TIMESTAMP should be converted to int64.
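> A minimal sketch of the proto schema shape the API accepts for such columns
> (the {{timestampvalue}} field name is assumed for illustration; {{datevalue}}
> matches the error below):
> {code:java}
> import com.google.protobuf.DescriptorProtos.DescriptorProto;
> import com.google.protobuf.DescriptorProtos.FieldDescriptorProto;
> import com.google.protobuf.DescriptorProtos.FieldDescriptorProto.Label;
> import com.google.protobuf.DescriptorProtos.FieldDescriptorProto.Type;
>
> class RowSchemaSketch {
>   // Self-describing schema for a table with a DATE and a TIMESTAMP column.
>   static DescriptorProto rowSchema() {
>     return DescriptorProto.newBuilder()
>         .setName("Row")
>         .addField(
>             FieldDescriptorProto.newBuilder()
>                 .setName("datevalue")
>                 .setNumber(1)
>                 .setType(Type.TYPE_INT32) // not TYPE_STRING
>                 .setLabel(Label.LABEL_OPTIONAL))
>         .addField(
>             FieldDescriptorProto.newBuilder()
>                 .setName("timestampvalue")
>                 .setNumber(2)
>                 .setType(Type.TYPE_INT64) // not TYPE_STRING
>                 .setLabel(Label.LABEL_OPTIONAL))
>         .build();
>   }
> }
> {code}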
> Here's the error message:
> INFO: Stream finished with error
> com.google.api.gax.rpc.InvalidArgumentException:
> io.grpc.StatusRuntimeException: INVALID_ARGUMENT: The proto field mismatched
> with BigQuery field at D6cbe536b_4dab_4292_8fda_ff2932dded49.datevalue, the
> proto field type string, BigQuery field type DATE Entity
> I have included an integration test here:
> [https://github.com/liu-du/beam/commit/b56823d1d213adf6ca5564ce1d244cc4ae8f0816]
>
> The problem is that DATE and TIMESTAMP are converted to String in the protobuf
> message here:
> [https://github.com/apache/beam/blob/a78fec72d0d9198eef75144a7bdaf93ada5abf9b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/TableRowToStorageApiProto.java#L69]
>
> The Storage Write API rejects the request because it expects int32/int64
> values. A sketch of the kind of mapping change involved follows.
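> As a hypothetical illustration only (the map name and entries here are
> assumed, not the actual Beam code or the PR), the fix amounts to changing
> the BigQuery-type-to-proto-type mapping along these lines:
> {code:java}
> import com.google.protobuf.DescriptorProtos.FieldDescriptorProto.Type;
> import java.util.Map;
>
> class BigQueryTypeMapping {
>   // BigQuery standard SQL type name -> proto wire type expected by the
>   // Storage Write API conversion table.
>   static final Map<String, Type> PRIMITIVE_TYPES =
>       Map.of(
>           "DATE", Type.TYPE_INT32,       // was TYPE_STRING; days since epoch
>           "TIMESTAMP", Type.TYPE_INT64); // was TYPE_STRING; micros since epoch
> }
> {code}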
>
> I've opened a PR here: https://github.com/apache/beam/pull/16926
--
This message was sent by Atlassian Jira
(v8.20.1#820001)