TheNeuralBit commented on a change in pull request #15485:
URL: https://github.com/apache/beam/pull/15485#discussion_r717829804
##########
File path:
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryUtils.java
##########
@@ -551,9 +548,9 @@ public static TableRow toTableRow(Row row) {
return toTableRow((Row) fieldValue);
case DATETIME:
- return ((Instant) fieldValue)
- .toDateTime(DateTimeZone.UTC)
- .toString(BIGQUERY_TIMESTAMP_PRINTER);
+ org.joda.time.Instant jodaInstant = (org.joda.time.Instant) fieldValue;
+ java.time.Instant javaInstant =
java.time.Instant.ofEpochMilli(jodaInstant.getMillis());
+ return BIGQUERY_TIMESTAMP_PRINTER.format(javaInstant);
Review comment:
(FYI I edited your message, you said DATETIME/joda Instant was limited
to microseconds, but I think you meant milliseconds)
You're right this is an improvement over the existing solution (converting
NanosInstant to DATETIME with millisecond precision), because it gets the full
supported microsecond precision into BigQuery. But there is an advantage with
the existing solution: the user has to opt-in to the truncation, by converting
to DATETIME. I feel strongly that we shouldn't be truncating implicitly, and if
we're doing that elsewhere I'd consider it a bug. Perhaps another reviewer
would feel differently, but that is my position.
That being said, you make a fair point that adding an option may be more
work than you want to take on. Would it be a reasonable compromise to make this
PR use the "erroring" approach, and file a follow-up ticket to add an option
for truncation?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]