blackhogz commented on a change in pull request #15485:
URL: https://github.com/apache/beam/pull/15485#discussion_r718240426
##########
File path:
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryUtils.java
##########
@@ -551,9 +548,9 @@ public static TableRow toTableRow(Row row) {
return toTableRow((Row) fieldValue);
case DATETIME:
- return ((Instant) fieldValue)
- .toDateTime(DateTimeZone.UTC)
- .toString(BIGQUERY_TIMESTAMP_PRINTER);
+ org.joda.time.Instant jodaInstant = (org.joda.time.Instant) fieldValue;
+ java.time.Instant javaInstant = java.time.Instant.ofEpochMilli(jodaInstant.getMillis());
+ return BIGQUERY_TIMESTAMP_PRINTER.format(javaInstant);
Review comment:
Thanks, @TheNeuralBit. I'm working with @amuletxheart and would like to help
move this forward.
For the opt-in flag, would the approach below be a reasonable way to proceed?
- Add a `BigQueryIO.Write#allowTruncatedTimestamps()` method for explicit
opt-in (i.e. the default is false).
- Pass the value to `BigQueryUtils.toTableRow()` as another parameter at the
call site
[here](https://github.com/apache/beam/blob/0111cff88025f0dc783a0890078b769139c8ae36/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L2687),
and `BigQueryUtils.toTableSchema()`
[here](https://github.com/apache/beam/blob/0111cff88025f0dc783a0890078b769139c8ae36/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L2690)
- Update `BigQueryUtils.toTableSchema()` to accept the NanosInstant logical
type only if the `allowTruncatedTimestamps` parameter is true. This should
fail with an error before any rows are processed, i.e. before
formatFunction is invoked. In fact, with this check in place, I don't think we
even need to pass `allowTruncatedTimestamps` to `BigQueryUtils.toTableRow()`.
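To make the last bullet concrete, here is a rough, self-contained sketch of the schema-time check. It does not use the real Beam `Schema` or `BigQueryUtils` types: `FieldKind`, `Field`, and `validateSchema` are hypothetical stand-ins, and only the `allowTruncatedTimestamps` name comes from the proposal above. The point is that the rejection happens once, on the schema, before any row is converted.

```java
import java.util.List;

// Simplified sketch; FieldKind, Field and validateSchema are hypothetical
// stand-ins, not part of the Beam API.
final class TruncatedTimestampCheck {
  enum FieldKind { STRING, INT64, DATETIME, NANOS_INSTANT }

  static final class Field {
    final String name;
    final FieldKind kind;

    Field(String name, FieldKind kind) {
      this.name = name;
      this.kind = kind;
    }
  }

  /**
   * Throws if the schema contains a nanosecond-precision field and the caller
   * has not opted in to truncation. Intended to run at schema conversion time,
   * i.e. before any row is formatted.
   */
  static void validateSchema(List<Field> fields, boolean allowTruncatedTimestamps) {
    for (Field field : fields) {
      if (field.kind == FieldKind.NANOS_INSTANT && !allowTruncatedTimestamps) {
        throw new IllegalArgumentException(
            "Field '" + field.name + "' has nanosecond precision; "
                + "opt in to truncated timestamps to write it.");
      }
    }
  }
}
```

With a check like this in the schema path, the row conversion could truncate unconditionally, because a pipeline that did not opt in would never reach it.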
Please let us know what you think, @TheNeuralBit and other maintainers.
Thanks a lot!
##########
File path:
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryUtils.java
##########
@@ -594,10 +591,15 @@ public static TableRow toTableRow(Row row) {
java.time.format.DateTimeFormatter localDateTimeFormatter =
(0 == localDateTime.getNano()) ? ISO_LOCAL_DATE_TIME :
BIGQUERY_DATETIME_FORMATTER;
return localDateTimeFormatter.format(localDateTime);
- } else if ("Enum".equals(identifier)) {
+ } else if (EnumerationType.IDENTIFIER.equals(identifier)) {
return fieldType
.getLogicalType(EnumerationType.class)
.toString((EnumerationType.Value) fieldValue);
+ } else if (NanosInstant.IDENTIFIER.equals(identifier)) {
+ if (fieldValue instanceof java.time.Instant) {
Review comment:
@amuletxheart I think we should align with the other field types and just
perform the cast without the if clause here. That way a type mismatch fails
loudly instead of silently dropping the field.
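A small standalone illustration of the difference, outside of Beam. `formatGuarded` and `formatUnguarded` are hypothetical names, and returning `null` is only my stand-in for "the field is silently dropped"; the actual fall-through behavior in `toTableRow` may differ.

```java
import java.time.Instant;

// Illustrative sketch only; these helpers are hypothetical, not Beam code.
final class LoudCastSketch {
  // Guarded variant: an unexpected type falls through and yields null,
  // so the caller never learns that something was wrong.
  static String formatGuarded(Object fieldValue) {
    if (fieldValue instanceof Instant) {
      return ((Instant) fieldValue).toString();
    }
    return null;
  }

  // Unguarded variant: an unexpected type throws ClassCastException at the
  // cast, which surfaces the problem immediately.
  static String formatUnguarded(Object fieldValue) {
    return ((Instant) fieldValue).toString();
  }
}
```

Passing a `String` to `formatGuarded` returns `null` without complaint, while the same call to `formatUnguarded` throws, which is the behavior the other field types already have.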