ffernandez92 commented on PR #29368:
URL: https://github.com/apache/beam/pull/29368#issuecomment-1804353900

I believe some other Go PR broke the Python Docker tests. @brucearctor, this PR enables the Beam YAML Kafka transforms to read and write proto.
   
Reading is the 'easy' part: the user provides the file descriptor and the message name, and from those we can build the Row from the raw bytes.
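
For context, here is a rough sketch of what the read side can look like using Beam's protobuf extension classes (ProtoDomain / ProtoDynamicMessageSchema). The helper and parameter names here are illustrative, not necessarily what this PR actually adds:

```java
import java.io.FileInputStream;
import java.io.IOException;

import com.google.protobuf.DescriptorProtos.FileDescriptorSet;
import com.google.protobuf.DynamicMessage;
import org.apache.beam.sdk.extensions.protobuf.ProtoDomain;
import org.apache.beam.sdk.extensions.protobuf.ProtoDynamicMessageSchema;
import org.apache.beam.sdk.transforms.SerializableFunction;
import org.apache.beam.sdk.values.Row;

// Sketch (not the PR's actual helper): turn Kafka payload bytes into Beam Rows,
// given a serialized FileDescriptorSet and a fully qualified message name.
public class ProtoReadSketch {

  public static SerializableFunction<byte[], Row> protoBytesToRow(
      String descriptorSetPath, String messageName) throws IOException {
    // The descriptor set is produced by the user, e.g. with `protoc --descriptor_set_out=...`.
    FileDescriptorSet descriptorSet;
    try (FileInputStream in = new FileInputStream(descriptorSetPath)) {
      descriptorSet = FileDescriptorSet.parseFrom(in);
    }
    ProtoDomain domain = ProtoDomain.buildFrom(descriptorSet);
    // ProtoDynamicMessageSchema maps the named proto message onto a Beam Schema.
    ProtoDynamicMessageSchema<DynamicMessage> protoSchema =
        ProtoDynamicMessageSchema.forDescriptor(domain, messageName);
    SerializableFunction<DynamicMessage, Row> toRow = protoSchema.getToRowFunction();

    return bytes -> {
      try {
        DynamicMessage message =
            DynamicMessage.parseFrom(domain.getDescriptor(messageName), bytes);
        return toRow.apply(message);
      } catch (Exception e) {
        throw new RuntimeException("Failed to decode proto payload", e);
      }
    };
  }
}
```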
   
Writing was a bit more challenging. I initially had this method, which writes to proto using only the Row schema:
   
```java
import org.apache.beam.model.pipeline.v1.SchemaApi;
import org.apache.beam.sdk.schemas.SchemaTranslation;
import org.apache.beam.sdk.transforms.SerializableFunction;
import org.apache.beam.sdk.transforms.SimpleFunction;
import org.apache.beam.sdk.values.Row;

// Serializes each Row as a Beam portability SchemaApi.Row proto.
public static SerializableFunction<Row, byte[]> getRowToProtoBytes() {
  return new SimpleFunction<Row, byte[]>() {
    @Override
    public byte[] apply(Row input) {
      SchemaApi.Row rowProto = SchemaTranslation.rowToProto(input);
      return rowProto.toByteArray();
    }
  };
}
```
However, this generates a SchemaApi.Row proto. That means any other system that wants to read these protos has to decode them using Beam's SchemaApi file descriptor, and so on. That doesn't make much sense: we don't want users to have to know about Beam's internal portability protos just to consume the output.
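
To make the burden concrete, a consumer of those bytes would need something along these lines just to decode the payload (illustrative snippet, not code from this PR):

```java
import com.google.protobuf.InvalidProtocolBufferException;
import org.apache.beam.model.pipeline.v1.SchemaApi;

// What a downstream consumer of the topic would be forced to do: take a dependency on
// Beam's portability protos and separately know the pipeline's Beam Schema in order to
// interpret the field values.
class SchemaApiRowConsumer {
  static SchemaApi.Row decode(byte[] kafkaValue) throws InvalidProtocolBufferException {
    SchemaApi.Row rowProto = SchemaApi.Row.parseFrom(kafkaValue);
    // rowProto.getValuesList() is just a list of SchemaApi.FieldValue messages; their
    // meaning depends on the Beam Schema, which is not carried in the payload.
    return rowProto;
  }
}
```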
   
As a result, I am also requesting the file descriptor and message name for the output. That way other systems can consume the output proto with their own descriptor, without needing to know any Beam internals. This description may be dense, but I believe the approach makes sense. I tested it with Dataflow (both read and write) and it worked fine.
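
For reference, a minimal sketch of that write direction, again with illustrative names rather than the exact helper in this PR:

```java
import com.google.protobuf.DescriptorProtos.FileDescriptorSet;
import com.google.protobuf.DynamicMessage;
import org.apache.beam.sdk.extensions.protobuf.ProtoDomain;
import org.apache.beam.sdk.extensions.protobuf.ProtoDynamicMessageSchema;
import org.apache.beam.sdk.transforms.SerializableFunction;
import org.apache.beam.sdk.values.Row;

// Sketch: convert Rows back into protos of the user's own message type, so consumers
// only need their own descriptor rather than Beam's SchemaApi.
public class ProtoWriteSketch {

  public static SerializableFunction<Row, byte[]> rowToProtoBytes(
      FileDescriptorSet descriptorSet, String messageName) {
    ProtoDomain domain = ProtoDomain.buildFrom(descriptorSet);
    ProtoDynamicMessageSchema<DynamicMessage> protoSchema =
        ProtoDynamicMessageSchema.forDescriptor(domain, messageName);
    SerializableFunction<Row, DynamicMessage> fromRow = protoSchema.getFromRowFunction();
    // The resulting bytes parse directly as the user's message type on the consumer side.
    return row -> fromRow.apply(row).toByteArray();
  }
}
```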

