[
https://issues.apache.org/jira/browse/BEAM-2969?focusedWorklogId=212717&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-212717
]
ASF GitHub Bot logged work on BEAM-2969:
----------------------------------------
Author: ASF GitHub Bot
Created on: 13/Mar/19 21:57
Start Date: 13/Mar/19 21:57
Worklog Time Spent: 10m
Work Description: udim commented on pull request #8046: [BEAM-2969]
Handle negative AVRO timestamps
URL: https://github.com/apache/beam/pull/8046#discussion_r265348172
##########
File path:
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryAvroUtils.java
##########
@@ -311,12 +304,10 @@ private static Object convertRequiredField(
verify(v instanceof Boolean, "Expected Boolean, got %s", v.getClass());
return v;
case "TIMESTAMP":
- // TIMESTAMP data types are represented as Avro LONG types. They are converted back to
- // Strings with variable precision (up to six digits) to match the JSON files exported by
- // BigQuery.
+ // TIMESTAMP data types are represented as Avro LONG types, microseconds since the epoch.
+ // Values may be negative since BigQuery timestamps start at 0001-01-01 00:00:00 UTC.
verify(v instanceof Long, "Expected Long, got %s", v.getClass());
- double doubleValue = ((Long) v) / 1_000_000.0;
- return formatTimestamp(Double.toString(doubleValue));
+ return formatTimestamp((Long) v);
Review comment:
I had to cast that explicitly; `v` is of type `Object`.
----------------------------------------------------------------
Issue Time Tracking
-------------------
Worklog Id: (was: 212717)
Time Spent: 2.5h (was: 2h 20m)
> BigQueryIO fails when reading then writing timestamps before 1970
> -----------------------------------------------------------------
>
> Key: BEAM-2969
> URL: https://issues.apache.org/jira/browse/BEAM-2969
> Project: Beam
> Issue Type: Bug
> Components: io-java-gcp
> Reporter: Kevin Peterson
> Assignee: Udi Meiri
> Priority: Major
> Time Spent: 2.5h
> Remaining Estimate: 0h
>
> I have a batch pipeline which reads from BigQuery (via standard sql query),
> does a small transform, and writes the data back to BigQuery.
> This fails if any timestamps are present in the BQ data from before 1970:
> {{"message" : "JSON parsing error in row starting at position 0: Couldn't
> convert value to timestamp: Could not parse '1969-12-28 02:52:54.-484 UTC' as
> a timestamp. Required format is YYYY-MM-DD HH:MM[:SS[.SSSSSS]] Field:
> observed_timestamp; Value: 1969-12-28 02:52:54.-484 UTC",}}
> It appears the TableRow coder doesn't handle negative timestamps properly,
> using a negative number for the fractions of a second, which BQ considers
> invalid.
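
The negative fraction in the error above is consistent with splitting a negative microsecond count using truncating division, where the remainder keeps the sign of the dividend. The sketch below is not the Beam implementation: `NegativeTimestampSketch` and `formatMicros` are hypothetical names, and always printing six fractional digits is an assumption made for simplicity. It shows one way to split microseconds since the epoch with floor division so that the fractional part is never negative.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Illustrative sketch only; names here are hypothetical, not Beam APIs.
class NegativeTimestampSketch {
  private static final DateTimeFormatter SECONDS_FORMAT =
      DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC);

  // Formats microseconds since the epoch as "yyyy-MM-dd HH:mm:ss[.SSSSSS] UTC".
  static String formatMicros(long micros) {
    // floorDiv rounds toward negative infinity, so the matching
    // floorMod result is always in the range [0, 999999].
    long seconds = Math.floorDiv(micros, 1_000_000L);
    long fraction = Math.floorMod(micros, 1_000_000L);
    String base = SECONDS_FORMAT.format(Instant.ofEpochSecond(seconds));
    if (fraction == 0) {
      return base + " UTC";
    }
    // Assumption: always emit six digits rather than trimming trailing zeros.
    return String.format(Locale.ROOT, "%s.%06d UTC", base, fraction);
  }

  public static void main(String[] args) {
    long micros = -1_500_000L; // 1.5 seconds before the epoch
    // Truncating remainder keeps the sign of the dividend.
    System.out.println(micros % 1_000_000L);
    // Floor remainder is non-negative for a positive divisor.
    System.out.println(Math.floorMod(micros, 1_000_000L));
    System.out.println(formatMicros(micros));
  }
}
```

With floor division, 1.5 seconds before the epoch is split into second -2 plus 500000 microseconds, which formats as a valid pre-1970 timestamp instead of a negative fraction.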