RustedBones commented on code in PR #32514:
URL: https://github.com/apache/beam/pull/32514#discussion_r1773126459
##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryStorageSourceBase.java:
##########
@@ -182,30 +178,18 @@ public List<BigQueryStorageStreamSource<T>> split(
LOG.info("Read session returned {} streams", readSession.getStreamsList().size());
}
- Schema sessionSchema;
- if (readSession.getDataFormat() == DataFormat.ARROW) {
- org.apache.arrow.vector.types.pojo.Schema schema =
- ArrowConversion.arrowSchemaFromInput(
- readSession.getArrowSchema().getSerializedSchema().newInput());
- org.apache.beam.sdk.schemas.Schema beamSchema =
- ArrowConversion.ArrowSchemaTranslator.toBeamSchema(schema);
- sessionSchema = AvroUtils.toAvroSchema(beamSchema);
- } else if (readSession.getDataFormat() == DataFormat.AVRO) {
- sessionSchema = new Schema.Parser().parse(readSession.getAvroSchema().getSchema());
- } else {
- throw new IllegalArgumentException(
- "data is not in a supported dataFormat: " + readSession.getDataFormat());
+ // TODO: this is inconsistent with method above, where it can be null
+ Preconditions.checkStateNotNull(targetTable);
+ TableSchema tableSchema = targetTable.getSchema();
+ if (selectedFieldsProvider != null && selectedFieldsProvider.isAccessible()) {
+ tableSchema = BigQueryUtils.trimSchema(tableSchema, selectedFieldsProvider.get());
Review Comment:
Yes, the operation here is correct. The change is just for consistency and simplicity:
- We also trim the schema in `BigQueryIO`, where we do not have access to
a trimmed Avro schema.
- When we do a direct read with `ARROW`, getting the Avro schema requires
`arrow -> beam -> avro` schema conversions, just for the sake of filtering
known fields.
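
To illustrate what "trimming the schema" to the selected fields means, here is a minimal, self-contained sketch. It does not use the Beam or BigQuery classes: `Field` and `trim` are hypothetical stand-ins for `TableFieldSchema` and `BigQueryUtils.trimSchema`, and the dotted-path handling for nested fields is an assumption about the intended behavior, not a copy of the actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SchemaTrimSketch {

  /** Hypothetical stand-in for a table field: a name plus optional nested fields. */
  static final class Field {
    final String name;
    final List<Field> fields;

    Field(String name, Field... fields) {
      this.name = name;
      this.fields = Arrays.asList(fields);
    }

    @Override
    public String toString() {
      return fields.isEmpty() ? name : name + fields;
    }
  }

  /** Keeps only the selected fields; "a.b" selects nested field b of record a. */
  static List<Field> trim(List<Field> schema, List<String> selected) {
    Set<String> whole = new HashSet<>(); // fields selected in full
    Map<String, List<String>> nested = new LinkedHashMap<>(); // record -> sub-selections
    for (String path : selected) {
      int dot = path.indexOf('.');
      if (dot < 0) {
        whole.add(path);
      } else {
        nested
            .computeIfAbsent(path.substring(0, dot), k -> new ArrayList<>())
            .add(path.substring(dot + 1));
      }
    }

    // Walk the schema in its own order so the trimmed schema preserves field order.
    List<Field> result = new ArrayList<>();
    for (Field field : schema) {
      if (whole.contains(field.name)) {
        result.add(field);
      } else if (nested.containsKey(field.name)) {
        List<Field> kept = trim(field.fields, nested.get(field.name));
        result.add(new Field(field.name, kept.toArray(new Field[0])));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<Field> schema =
        Arrays.asList(
            new Field("id"),
            new Field("name"),
            new Field("address", new Field("city"), new Field("zip")));
    System.out.println(trim(schema, Arrays.asList("id", "address.city")));
  }
}
```

Trimming the table schema directly, as in this sketch, avoids going through the `arrow -> beam -> avro` conversions only to obtain a filtered schema.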