Abacn commented on code in PR #29923:
URL: https://github.com/apache/beam/pull/29923#discussion_r1445317256
##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java:
##########
@@ -3575,11 +3573,22 @@ private <DestinationT> WriteResult continueExpandTyped(
!getPropagateSuccessfulStorageApiWrites(),
"withPropagateSuccessfulStorageApiWrites only supported when using storage api writes.");
- // Batch load jobs currently support JSON data insertion only with CSV files
+ // Beam does not yet support Batch load jobs with Avro files
if (getJsonSchema() != null && getJsonSchema().isAccessible()) {
JsonElement schema = JsonParser.parseString(getJsonSchema().get());
- if (!schema.getAsJsonObject().keySet().isEmpty()) {
- validateNoJsonTypeInSchema(schema);
+ if (!schema.getAsJsonObject().keySet().isEmpty() && hasJsonTypeInSchema(schema)) {
+ if (rowWriterFactory.getOutputType() == OutputType.JsonTableRow) {
+ LOG.warn(
+ "Found JSON type in TableSchema for 'FILE_LOADS' write method. \n"
+ + "Make sure the TableSchema field is a parsed JSON to ensure the read as a "
+ + "JSON type. Otherwise it will read as a raw (escaped) string.");
Review Comment:
Yes, this is documented behavior of the BigQuery FILE_LOADS API:
the value needs to be `"someColumn": [1]` rather than `"someColumn": "[1]"`.
Otherwise BigQuery will write it as a single string (a string is also a valid JSON value).
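To illustrate the escaped-string vs. parsed-JSON distinction, here is a minimal Gson sketch (Gson's `JsonParser` is already used in the quoted diff; the class name `JsonColumnExample` and the column name `someColumn` are illustrative, not from the PR). Adding the value as a string keeps the quotes and escaping, while adding it as a parsed `JsonElement` produces a real JSON array:

```java
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;

public class JsonColumnExample {
  public static void main(String[] args) {
    // Escaped form: the JSON value is stored as a plain string,
    // so a FILE_LOADS load would treat someColumn as the literal text "[1]".
    JsonObject escaped = new JsonObject();
    escaped.addProperty("someColumn", "[1]");

    // Parsed form: the JSON value is stored as an actual array,
    // so a FILE_LOADS load would treat someColumn as the JSON array [1].
    JsonObject parsed = new JsonObject();
    parsed.add("someColumn", JsonParser.parseString("[1]"));

    System.out.println(escaped); // {"someColumn":"[1]"}
    System.out.println(parsed);  // {"someColumn":[1]}
  }
}
```

The difference in the serialized row is exactly the one the comment describes: the first form round-trips as one string value, the second as structured JSON.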