[
https://issues.apache.org/jira/browse/BEAM-13959?focusedWorklogId=728522&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-728522
]
ASF GitHub Bot logged work on BEAM-13959:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 16/Feb/22 21:04
Start Date: 16/Feb/22 21:04
Worklog Time Spent: 10m
Work Description: chamikaramj commented on a change in pull request #16872:
URL: https://github.com/apache/beam/pull/16872#discussion_r808450165
##########
File path: sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/TableRowToStorageApiProtoTest.java
##########
@@ -282,28 +487,36 @@ public void testNestedFromTableSchema() {
          .put("arrayvalue", ImmutableList.of("hello", "goodbye"))
          .build();
-  private void assertBaseRecord(DynamicMessage msg) {
+  private void assertBaseRecord(DynamicMessage msg, boolean withF) {
    Map<String, Object> recordFields =
        msg.getAllFields().entrySet().stream()
            .collect(
                Collectors.toMap(entry -> entry.getKey().getName(), entry -> entry.getValue()));
-    assertEquals(BASE_ROW_EXPECTED_PROTO_VALUES, recordFields);
+    assertEquals(
+        withF ? BASE_ROW_EXPECTED_PROTO_VALUES : BASE_ROW_NO_F_EXPECTED_PROTO_VALUES, recordFields);
  }
  @Test
  public void testMessageFromTableRow() throws Exception {
    TableRow tableRow =
-        new TableRow().set("nestedValue1", BASE_TABLE_ROW).set("nestedValue2", BASE_TABLE_ROW);
+        new TableRow()
+            .set("nestedValue1", BASE_TABLE_ROW)
Review comment:
Do we also support fields named "f" at the top level (not just nested or
repeated fields) after this change? If so, let's add a test for that as well.
If not, let's clarify that in the documentation.
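For reference, a top-level test along these lines could be added to
TableRowToStorageApiProtoTest. This is only a sketch that reuses that test
class's existing imports; the getDescriptorFromTableSchema and
messageFromTableRow helper signatures and the exact assertion are assumed, not
taken from this PR:

    @Test
    public void testMessageFromTableRowWithTopLevelF() throws Exception {
      // Schema whose only top-level column is literally named "f".
      TableSchema schema =
          new TableSchema()
              .setFields(ImmutableList.of(new TableFieldSchema().setType("STRING").setName("f")));
      // Populate the row through the typed f/v cell API rather than set("f", ...),
      // which would collide with TableRow's own List<TableCell> property named "f".
      TableRow tableRow = new TableRow().setF(ImmutableList.of(new TableCell().setV("value")));

      Descriptor descriptor = TableRowToStorageApiProto.getDescriptorFromTableSchema(schema);
      DynamicMessage msg = TableRowToStorageApiProto.messageFromTableRow(descriptor, tableRow);
      assertEquals("value", msg.getField(descriptor.findFieldByName("f")));
    }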
##########
File path: sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/TableRowToStorageApiProtoTest.java
##########
@@ -244,9 +393,46 @@ public void testNestedFromTableSchema() {
            .collect(
                Collectors.toMap(FieldDescriptorProto::getName, FieldDescriptorProto::getType));
    assertEquals(expectedBaseTypes, nestedTypes2);
+
+    assertEquals(Type.TYPE_MESSAGE, types.get("nestedvaluenof1"));
+    String nestedTypeNameNoF1 = typeNames.get("nestedvaluenof1");
+    Map<String, Type> nestedTypesNoF1 =
+        nestedTypes.get(nestedTypeNameNoF1).getFieldList().stream()
+            .collect(
+                Collectors.toMap(FieldDescriptorProto::getName, FieldDescriptorProto::getType));
+    assertEquals(expectedBaseTypesNoF, nestedTypesNoF1);
+    assertEquals(Type.TYPE_MESSAGE, types.get("nestedvaluenof2"));
+    String nestedTypeNameNoF2 = typeNames.get("nestedvaluenof2");
+    Map<String, Type> nestedTypesNoF2 =
+        nestedTypes.get(nestedTypeNameNoF2).getFieldList().stream()
+            .collect(
+                Collectors.toMap(FieldDescriptorProto::getName, FieldDescriptorProto::getType));
+    assertEquals(expectedBaseTypesNoF, nestedTypesNoF2);
  }
  private static final TableRow BASE_TABLE_ROW =
+      new TableRow()
+          .setF(
Review comment:
Is this the API users should use for writing to fields named "f"? If so, we
should document it in the Javadocs. Also, is this fixed for all write methods,
or just the Storage API?
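For context on why some explicit API is needed here: TableRow itself declares a
typed List<TableCell> property named "f", so the generic set("f", value) path
trips over it with exactly the IllegalArgumentException quoted in the issue
below. A minimal, self-contained illustration (the concrete cell values mirror
the reporter's row [2] and are only examples):

    import com.google.api.services.bigquery.model.TableCell;
    import com.google.api.services.bigquery.model.TableRow;
    import java.util.Arrays;

    public class FColumnRowExample {
      public static void main(String[] args) {
        // Throws IllegalArgumentException ("Can not set java.util.List field
        // com.google.api.services.bigquery.model.TableRow.f to java.lang.Double")
        // because TableRow reserves a declared List<TableCell> field named "f":
        // new TableRow().set("f", 3.0);

        // The typed f/v form exercised by the new tests: one TableCell per schema
        // field, in schema order (here the nested "data" record a..f from [1]).
        TableRow data =
            new TableRow()
                .setF(
                    Arrays.asList(
                        new TableCell().setV(1.627424812511E12), // a
                        new TableCell().setV(3.0),               // b
                        new TableCell().setV(3.0),               // c
                        new TableCell().setV(530.0),             // d
                        new TableCell().setV(675.0),             // e
                        new TableCell().setV(42.0)));            // f (placeholder value)
        System.out.println(data);
      }
    }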
##########
File path: sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/TableRowToStorageApiProtoTest.java
##########
@@ -50,6 +52,31 @@
/** Unit tests for {@link org.apache.beam.sdk.io.gcp.bigquery.TableRowToStorageApiProto}. */
public class TableRowToStorageApiProtoTest {
  private static final TableSchema BASE_TABLE_SCHEMA =
+      new TableSchema()
+          .setFields(
+              ImmutableList.<TableFieldSchema>builder()
+                  .add(new TableFieldSchema().setType("STRING").setName("stringValue"))
+                  .add(new TableFieldSchema().setType("STRING").setName("f"))
Review comment:
Let's document why we are special-casing "f" for testing here, or make this a
separate test.
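One lightweight option, if a separate test feels like overkill, would be an
inline comment on the schema entry itself, for example (wording is only a
suggestion):

    // "f" is included on purpose: TableRow reserves a typed List<TableCell>
    // property named "f", and BEAM-13959 tracks writes failing for tables with
    // such a column, so the base schema keeps one to guard against regressions.
    .add(new TableFieldSchema().setType("STRING").setName("f"))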
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 728522)
Time Spent: 20m (was: 10m)
> Unable to write to BigQuery tables with column named 'f'
> --------------------------------------------------------
>
> Key: BEAM-13959
> URL: https://issues.apache.org/jira/browse/BEAM-13959
> Project: Beam
> Issue Type: Bug
> Components: io-java-gcp
> Affects Versions: 2.36.0
> Reporter: Joel Weierman
> Assignee: Reuven Lax
> Priority: P1
> Time Spent: 20m
> Remaining Estimate: 0h
>
> When using the BigQuery Storage Write API through the Java Beam SDK (both the
> latest release 2.35.0 and 2.36.0-SNAPSHOT), there seems to be an issue when
> converting TableRow fields to the Storage API proto for columns named 'f'.
> Reproduction steps: with the table schema [1] and input row [2] below, writing
> to a table whose schema contains a column named 'f' fails with the error shown
> after [2].
> [1]
> "name": "item3",
> "type": "RECORD",
> "mode": "NULLABLE",
> "fields": [
>   {
>     "name": "data",
>     "mode": "NULLABLE",
>     "type": "RECORD",
>     "fields": [
>       { "mode": "NULLABLE", "name": "a", "type": "FLOAT" },
>       { "mode": "NULLABLE", "name": "b", "type": "FLOAT" },
>       { "mode": "NULLABLE", "name": "c", "type": "FLOAT" },
>       { "mode": "NULLABLE", "name": "d", "type": "FLOAT" },
>       { "mode": "NULLABLE", "name": "e", "type": "FLOAT" },
>       { "mode": "NULLABLE", "name": "f", "type": "FLOAT" }
>     ]
>   }
> ]
> [2]
> {
>   ...
>   "item3": {
>     "data": {
>       "a": 1.627424812511E12,
>       "b": 3.0,
>       "c": 3.0,
>       "d": 530.0,
>       "e": 675.0
>     }
>   },
>   ...
> }
> The following error occurs:
>
> Exception in thread "main" org.apache.beam.sdk.Pipeline$PipelineExecutionException:
> java.lang.IllegalArgumentException: Can not set java.util.List field
> com.google.api.services.bigquery.model.TableRow.f to java.lang.Double
>     at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:373)
>     at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:341)
>     at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:218)
>     at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:67)
>     at org.apache.beam.sdk.Pipeline.run(Pipeline.java:323)
>     at org.apache.beam.sdk.Pipeline.run(Pipeline.java:309)
>     at com.google.cloud.teleport.templates.PubSubToBigQuery.run(PubSubToBigQuery.java:342)
>     at com.google.cloud.teleport.templates.PubSubToBigQuery.main(PubSubToBigQuery.java:223)
> Caused by: java.lang.IllegalArgumentException: Can not set java.util.List field
> com.google.api.services.bigquery.model.TableRow.f to java.lang.Double
>     at sun.reflect.UnsafeFieldAccessorImpl.throwSetIllegalArgumentException(UnsafeFieldAccessorImpl.java:167)
>     at sun.reflect.UnsafeFieldAccessorImpl.throwSetIllegalArgumentException(UnsafeFieldAccessorImpl.java:171)
>     at sun.reflect.UnsafeObjectFieldAccessorImpl.set(UnsafeObjectFieldAccessorImpl.java:81)
>     at java.lang.reflect.Field.set(Field.java:764)
>     at com.google.api.client.util.FieldInfo.setFieldValue(FieldInfo.java:275)
>     at com.google.api.client.util.FieldInfo.setValue(FieldInfo.java:231)
>     at com.google.api.client.util.GenericData.set(GenericData.java:118)
>     at com.google.api.client.json.GenericJson.set(GenericJson.java:91)
>     at com.google.api.services.bigquery.model.TableRow.set(TableRow.java:64)
>     at com.google.api.services.bigquery.model.TableRow.set(TableRow.java:29)
>     at com.google.api.client.util.GenericData.putAll(GenericData.java:131)
>     at org.apache.beam.sdk.io.gcp.bigquery.TableRowToStorageApiProto.toProtoValue(TableRowToStorageApiProto.java:206)
>     at org.apache.beam.sdk.io.gcp.bigquery.TableRowToStorageApiProto.messageValueFromFieldValue(TableRowToStorageApiProto.java:175)
>     at org.apache.beam.sdk.io.gcp.bigquery.TableRowToStorageApiProto.messageFromTableRow(TableRowToStorageApiProto.java:103)
>     at org.apache.beam.sdk.io.gcp.bigquery.TableRowToStorageApiProto.toProtoValue(TableRowToStorageApiProto.java:207)
>     at org.apache.beam.sdk.io.gcp.bigquery.TableRowToStorageApiProto.messageValueFromFieldValue(TableRowToStorageApiProto.java:175)
>     at org.apache.beam.sdk.io.gcp.bigquery.TableRowToStorageApiProto.messageFromTableRow(TableRowToStorageApiProto.java:103)
>     at org.apache.beam.sdk.io.gcp.bigquery.StorageApiDynamicDestinationsTableRow$1.toMessage(StorageApiDynamicDestinationsTableRow.java:95)
>     at org.apache.beam.sdk.io.gcp.bigquery.StorageApiConvertMessages$ConvertMessagesDoFn.processElement(StorageApiConvertMessages.java:106)
> This error does not occur if I leave the write method set to Streaming
> Inserts.
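> For concreteness, a minimal sketch of the two write configurations being
> compared; the table reference, schema, and "rows" PCollection<TableRow> are
> placeholders (the real pipeline is the PubSubToBigQuery template in the stack
> trace), and the extra Storage API settings are what an unbounded pipeline
> typically needs:
>
>     // Fails with the IllegalArgumentException above when the table has a column "f":
>     rows.apply(
>         BigQueryIO.writeTableRows()
>             .to("project:dataset.table")
>             .withSchema(schema)
>             .withMethod(BigQueryIO.Write.Method.STORAGE_WRITE_API)
>             .withTriggeringFrequency(Duration.standardSeconds(5))
>             .withNumStorageWriteApiStreams(1));
>
>     // Works: the same write using streaming inserts instead.
>     rows.apply(
>         BigQueryIO.writeTableRows()
>             .to("project:dataset.table")
>             .withSchema(schema)
>             .withMethod(BigQueryIO.Write.Method.STREAMING_INSERTS));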
--
This message was sent by Atlassian Jira
(v8.20.1#820001)