ffernandez92 commented on PR #29368:
URL: https://github.com/apache/beam/pull/29368#issuecomment-1804353900
I believe some other Go PR broke the Python Docker tests. @brucearctor,
this PR enables the Kafka Beam YAML IO to read and write proto.
Reading is 'easy' because the user provides the file descriptor and the
message name, so we can build the Row from the payload bytes.
Writing was a bit more challenging. I initially had this method, which
writes proto bytes using just the Row schema:
```java
import org.apache.beam.model.pipeline.v1.SchemaApi;
import org.apache.beam.sdk.schemas.SchemaTranslation;
import org.apache.beam.sdk.transforms.SerializableFunction;
import org.apache.beam.sdk.transforms.SimpleFunction;
import org.apache.beam.sdk.values.Row;

/** Serializes a Row as Beam's internal SchemaApi.Row proto. */
public static SerializableFunction<Row, byte[]> getRowToProtoBytes() {
  return new SimpleFunction<Row, byte[]>() {
    @Override
    public byte[] apply(Row input) {
      SchemaApi.Row rowProto = SchemaTranslation.rowToProto(input);
      return rowProto.toByteArray();
    }
  };
}
```
However, this produces a SchemaApi.Row proto. That means any other system
that wants to read these payloads has to decode them with Beam's internal
SchemaApi descriptors. That doesn't make much sense: we don't want to force
users to treat Beam's internal representation as the output format.
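To illustrate the problem (assuming a Java consumer; `kafkaRecordValue` is just a placeholder): every downstream reader would need Beam's pipeline protos on its classpath just to decode the payload, instead of using the classes generated from its own .proto file.
```java
import org.apache.beam.model.pipeline.v1.SchemaApi;

public class ConsumerSketch {
  // With the original approach, a downstream consumer has to depend on Beam's
  // internal pipeline protos and decode SchemaApi.Row rather than its own message type.
  static SchemaApi.Row decode(byte[] kafkaRecordValue) throws Exception {
    return SchemaApi.Row.parseFrom(kafkaRecordValue);
  }
}
```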
As a result, I am also requesting the file descriptor and the message name
for the output. That way other systems can consume the output proto as the
user's own message type, without needing to know any Beam internals (rough
sketch of the write path below). While this description may be dense, I
believe it makes sense. I tested it with Dataflow (both read and write), and
it worked fine.
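A minimal sketch of that write path, assuming the descriptor is resolved as in the read sketch above and that Row field names and types map directly onto the proto fields (not the exact code in this PR):
```java
import com.google.protobuf.Descriptors.Descriptor;
import com.google.protobuf.Descriptors.FieldDescriptor;
import com.google.protobuf.DynamicMessage;
import org.apache.beam.sdk.values.Row;

public class ProtoWriteSketch {

  /**
   * Serializes a Beam Row as the user-supplied message type, so downstream consumers
   * can decode the payload with their own generated classes.
   */
  static byte[] rowToUserProtoBytes(Descriptor descriptor, Row row) {
    DynamicMessage.Builder builder = DynamicMessage.newBuilder(descriptor);
    for (FieldDescriptor field : descriptor.getFields()) {
      // Simplification: assumes Row field names match proto field names and the
      // values are directly compatible with the proto field types.
      Object value = row.getValue(field.getName());
      if (value != null) {
        builder.setField(field, value);
      }
    }
    return builder.build().toByteArray();
  }
}
```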