ccciudatu edited a comment on pull request #13428: URL: https://github.com/apache/beam/pull/13428#issuecomment-743118061
> LGTM, thank you @ccciudata! This looks like a very useful contribution and its well tested too :) I have a few minor asks > > Just curious, how are you expecting this to be used? Will it be used to read/write Thrift messages over other IOs like PubSub and Kafka? > > The reason I ask is that this will make the third SchemaProvider that can be used to to encode/decode messages and map them to Beam schemas (Avro and Protobuf are the others). So its probably about time we think about some abstraction to make these easily usable with PubsubIO/KafkaIO. @TheNeuralBit I think I only now got the true meaning of your question. *(disclaimer: I'm pretty new to Beam and I had to start with this "unfit" use case)*. My current approach assumes no integration between `ThriftCoder` and `ThriftSchema`, so one is supposed to include those separately (at least that is what I'm doing in my PoC): I'm using Kafka native (de)serializers to get Thrift data from/to the topic using the Kafka client alone, register the `ThriftCoder` for those key/value types and then (with a clumsy/redundant `PCollection.setSchema()` call) force my `ThriftSchema` to be picked up for row conversion, as the mere registration does not seem to work. I guess what you're saying is that the right approach would be to make use of some `BeamKafkaThriftTable` extending `BeamKafkaTable` and just read/write bytes from/to Kafka. This makes total sense and is a great improvement over what I have so far. So do expect a follow-up PR, unless you think this schema provider alone does not really deserve to be merged by itself. I will also have a closer look at the current abstractions and file Jiras for potential improvements if I'll be able to come up with any. P.S. as a side joke: you misspelled my name in a way that also changes my gender (at least in my native language), but it sounds so funny that I'm thinking of changing my username to that. :) ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
