[
https://issues.apache.org/jira/browse/BEAM-11338?focusedWorklogId=523137&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523137
]
ASF GitHub Bot logged work on BEAM-11338:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 11/Dec/20 10:39
Start Date: 11/Dec/20 10:39
Worklog Time Spent: 10m
Work Description: ccciudatu commented on pull request #13428:
URL: https://github.com/apache/beam/pull/13428#issuecomment-743118061
> LGTM, thank you @ccciudata! This looks like a very useful contribution and
its well tested too :) I have a few minor asks
>
> Just curious, how are you expecting this to be used? Will it be used to
read/write Thrift messages over other IOs like PubSub and Kafka?
>
> The reason I ask is that this will make the third SchemaProvider that can
be used to to encode/decode messages and map them to Beam schemas (Avro and
Protobuf are the others). So its probably about time we think about some
abstraction to make these easily usable with PubsubIO/KafkaIO.
@TheNeuralBit I think I only now got the true meaning of your question.
*(disclaimer: I'm pretty new to Beam and I had to start with this "unfit" use
case)*.
My current approach assumes no integration between `ThriftCoder` and
`ThriftSchema`, so one is supposed to include those separately (at least that
is what I'm doing in my PoC): I'm using Kafka native (de)serializers to get
Thrift data from/to the topic using the Kafka client alone, register the
`ThriftCoder` for those key/value types and then (with a clumsy/redundant
`PCollection.setSchema()` call) force my `ThriftSchema` to be picked up for row
conversion, as the mere registration does not seem to work.
I guess what you're saying is that the right approach would be to make use
of some `BeamKafkaThriftTable` extending `BeamKafkaTable` and just read/write
bytes from/to Kafka. This makes total sense and is a great improvement over
what I have so far. So do expect a follow-up PR, unless you think this schema
provider alone does not really deserve to be merged my itself. I will also have
a closer look at the current abstractions and file Jiras for potential
improvements if I'll be able to come up with any.
P.S. as a side joke: you misspelled my name in a way that also changes my
gender (at least in my native language), but it sounds so funny that I'm
thinking of changing my username to that. :)
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 523137)
Time Spent: 3h 40m (was: 3.5h)
> Beam schema for thrift data
> ---------------------------
>
> Key: BEAM-11338
> URL: https://issues.apache.org/jira/browse/BEAM-11338
> Project: Beam
> Issue Type: New Feature
> Components: io-java-utilities
> Reporter: Costi Ciudatu
> Priority: P2
> Labels: thrift
> Time Spent: 3h 40m
> Remaining Estimate: 0h
>
> Define a SchemaProvider that can handle thrift objects, the same way as
> ProtoMessageSchema handles protobuf.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)