[
https://issues.apache.org/jira/browse/BEAM-13990?focusedWorklogId=732743&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-732743
]
ASF GitHub Bot logged work on BEAM-13990:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 24/Feb/22 22:34
Start Date: 24/Feb/22 22:34
Worklog Time Spent: 10m
Work Description: liu-du commented on a change in pull request #16926:
URL: https://github.com/apache/beam/pull/16926#discussion_r814325014
##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/StorageApiDynamicDestinationsTableRow.java
##########
@@ -23,24 +23,18 @@
import com.google.api.services.bigquery.model.TableSchema;
import com.google.protobuf.Descriptors.Descriptor;
import com.google.protobuf.Message;
-import java.time.Duration;
+import java.io.IOException;
import javax.annotation.Nullable;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryServices.DatasetService;
import org.apache.beam.sdk.transforms.SerializableFunction;
-import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.Cache;
-import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.CacheBuilder;
@SuppressWarnings({"nullness"})
public class StorageApiDynamicDestinationsTableRow<T, DestinationT>
extends StorageApiDynamicDestinations<T, DestinationT> {
private final SerializableFunction<T, TableRow> formatFunction;
private final CreateDisposition createDisposition;
- // TODO: Is this cache needed? All callers of getMessageConverter are already caching the resullt.
- private final Cache<DestinationT, Descriptor> destinationDescriptorCache =
-     CacheBuilder.newBuilder().expireAfterAccess(Duration.ofMinutes(15)).build();
Review comment:
I agree with the TODO comment that the cache is not needed here. The
MessageConverter returned by getMessageConverter also needs to remember the
equivalent TableSchema, but this cache only saves the Descriptor, so I removed
it entirely.
Issue Time Tracking
-------------------
Worklog Id: (was: 732743)
Remaining Estimate: 114h (was: 114h 10m)
Time Spent: 6h (was: 5h 50m)
> BigQueryIO cannot write to DATE and TIMESTAMP columns when using Storage
> Write API
> -----------------------------------------------------------------------------------
>
> Key: BEAM-13990
> URL: https://issues.apache.org/jira/browse/BEAM-13990
> Project: Beam
> Issue Type: Improvement
> Components: io-java-gcp
> Affects Versions: 2.36.0
> Reporter: Du Liu
> Assignee: Du Liu
> Priority: P2
> Original Estimate: 120h
> Time Spent: 6h
> Remaining Estimate: 114h
>
> When using the Storage Write API with BigQueryIO, DATE and TIMESTAMP values are
> currently converted to String type in the protobuf message. This is incorrect:
> according to the Storage Write API [documentation|#data_type_conversions], DATE
> should be converted to int32 and TIMESTAMP should be converted to int64.
> Here's the error message:
> INFO: Stream finished with error
> com.google.api.gax.rpc.InvalidArgumentException:
> io.grpc.StatusRuntimeException: INVALID_ARGUMENT: The proto field mismatched
> with BigQuery field at D6cbe536b_4dab_4292_8fda_ff2932dded49.datevalue, the
> proto field type string, BigQuery field type DATE Entity
> I have included an integration test here:
> [https://github.com/liu-du/beam/commit/b56823d1d213adf6ca5564ce1d244cc4ae8f0816]
>
> The problem is that DATE and TIMESTAMP are converted to String in the protobuf
> message here:
> [https://github.com/apache/beam/blob/a78fec72d0d9198eef75144a7bdaf93ada5abf9b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/TableRowToStorageApiProto.java#L69]
>
> The Storage Write API rejects the request because it expects int32/int64
> values.
>
> I've opened a PR here: https://github.com/apache/beam/pull/16926
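
For illustration, the conversion the issue asks for can be sketched in plain Java. This is not the code from the PR or from TableRowToStorageApiProto; the class and method names are hypothetical, and it assumes the input values arrive as ISO-8601 strings, with DATE encoded as days since 1970-01-01 (int32) and TIMESTAMP as microseconds since the Unix epoch (int64).

```java
import java.time.Instant;
import java.time.LocalDate;

// Hypothetical helper, not part of Beam: shows the integer encodings
// that the issue says the Storage Write API expects for DATE and TIMESTAMP.
final class StorageWriteEncodingSketch {

  private StorageWriteEncodingSketch() {}

  // DATE -> int32: number of days since 1970-01-01.
  // Assumes an ISO-8601 date string such as "2022-02-24".
  static int dateToEpochDays(String isoDate) {
    return Math.toIntExact(LocalDate.parse(isoDate).toEpochDay());
  }

  // TIMESTAMP -> int64: microseconds since the Unix epoch.
  // Assumes an ISO-8601 instant such as "2022-02-24T22:34:00Z".
  // Sub-microsecond precision is truncated.
  static long timestampToEpochMicros(String isoInstant) {
    Instant instant = Instant.parse(isoInstant);
    long wholeSecondsAsMicros = Math.multiplyExact(instant.getEpochSecond(), 1_000_000L);
    return Math.addExact(wholeSecondsAsMicros, instant.getNano() / 1_000L);
  }

  public static void main(String[] args) {
    System.out.println(dateToEpochDays("2022-02-24"));
    System.out.println(timestampToEpochMicros("2022-02-24T22:34:00Z"));
  }
}
```

The exact-arithmetic helpers (`Math.toIntExact`, `Math.multiplyExact`, `Math.addExact`) make out-of-range values fail loudly instead of silently overflowing the int32/int64 fields.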