[ 
https://issues.apache.org/jira/browse/BEAM-9051?focusedWorklogId=371271&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-371271
 ]

ASF GitHub Bot logged work on BEAM-9051:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 14/Jan/20 01:47
            Start Date: 14/Jan/20 01:47
    Worklog Time Spent: 10m 
      Work Description: TheNeuralBit commented on pull request #10540: 
[BEAM-9051] BigQueryUtils toBeamRow support the Avro timestamp-millis logical 
type
URL: https://github.com/apache/beam/pull/10540#discussion_r366114687
 
 

 ##########
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryUtils.java
 ##########
 @@ -565,21 +611,25 @@ public static Object convertAvroFormat(
       case BOOLEAN:
         return convertAvroPrimitiveTypes(beamFieldTypeName, avroValue);
       case DATETIME:
-        // Expecting value in microseconds.
-        switch (options.getTruncateTimestamps()) {
-          case TRUNCATE:
-            return truncateToMillis(avroValue);
-          case REJECT:
-            return safeToMillis(avroValue);
-          default:
-            throw new IllegalArgumentException(
-                String.format(
-                    "Unknown timestamp truncation option: %s", 
options.getTruncateTimestamps()));
+        if (avroFieldSchema != null
+            && avroFieldSchema.getLogicalType() == 
org.apache.avro.LogicalTypes.timestampMillis()) {
+          return new Instant((long) avroValue);
 
 Review comment:
   I think it would be better if we could map timestamp micros and timestamp 
millis to different Beam types, and then we could differentiate them without 
referencing the source avro schema. Ideally the Beam schema would leave no 
ambiguity.
   
   @alexvanboxel has proposed adding a nanosecond instant logical type in 
https://github.com/apache/beam/pull/10486. It seems we need logical types for 
millis and micros instants in that same vein.
   
   To me this seems like a good argument for deprecating the DATETIME primitive 
in favor of millis/micros/nanos logical types. It would provide much more 
clarity in situations like this.
   
   cc @reuvenlax 
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 371271)
    Time Spent: 1h 20m  (was: 1h 10m)

> BigQueryUtils toBeamRow to support GenericRecords with millisecond precision 
> timestamps
> ---------------------------------------------------------------------------------------
>
>                 Key: BEAM-9051
>                 URL: https://issues.apache.org/jira/browse/BEAM-9051
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-gcp
>    Affects Versions: 2.16.0
>            Reporter: Ryan Berti
>            Assignee: Ryan Berti
>            Priority: Minor
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Currently, BigQueryUtils assumes all Avro GenericRecords which utilize 
> timestamp fields include timestamp data in microseconds when converting from 
> GenericRecords to Beam Rows 
> ([https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryUtils.java#L568]).
>  The implementation should also support timestamp data in milliseconds; this 
> should be feasible by referencing the LogicalType associated with the avro 
> field to determine if the underlying value is represented in millis or micros.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to