Brian, Daniel,
Really appreciate the feedback, thanks!

I'll go through the aforementioned PRs to get a better understanding of the
proto support effort.
If it's ok with you, I'd like to implement and add the protobuf tests to
the PubSubTableProviderIT.

Please let me know if that works for you.

Thanks again,
- Fernando Morales

On Thu, Mar 18, 2021 at 12:15 PM Daniel Collins <dpcoll...@google.com>
wrote:

> Also see https://github.com/apache/beam/pull/13838 which is related
>
> On Thu, Mar 18, 2021 at 3:06 PM Daniel Collins <dpcoll...@google.com>
> wrote:
>
>> > appears to be dead code
>>
>> Its not- its used as the format for the configuration row here
>> https://github.com/apache/beam/blob/3fc2ab10d9f5d5c5b65ecf94ce45861857206674/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubSchemaIOProvider.java#L145
>>
>> On Thu, Mar 18, 2021 at 2:53 PM Brian Hulette <bhule...@google.com>
>> wrote:
>>
>>> Ah the enum I linked in PubsubSchemaIOProvider appears to be dead code,
>>> I don't think it's referenced anywhere. So I think the implementation
>>> should be done, we'd just need to modify PubsubTableProviderIT to exercise
>>> the proto forma.
>>>
>>> On Thu, Mar 18, 2021 at 11:46 AM Daniel Collins <dpcoll...@google.com>
>>> wrote:
>>>
>>>> > because there are a couple places where Avro, JSON are still
>>>> hard-coded [2,3]
>>>>
>>>> This is not a blocker, its due to the fact that PubsubTableProvider is
>>>> just a wrapper for PubsubSchemaIOProvider. SchemaIOProvider requires you to
>>>> specify all possible options, TableProvider does not.
>>>>
>>>> On Thu, Mar 18, 2021 at 2:44 PM Brian Hulette <bhule...@google.com>
>>>> wrote:
>>>>
>>>>> Hi Fernando,
>>>>>
>>>>> Daniel Collins actually added the PayloadSerializerProvider concept
>>>>> very recently [1], which is why it looks like Piotr's code doesn't apply
>>>>> anymore. But the good news is I think that PR gets this task pretty close
>>>>> to completion. It doesn't look like the PR *quite* finished adding support
>>>>> for Proto to PubSubTableProvider though, because there are a couple places
>>>>> where Avro, JSON are still hard-coded [2,3]. For this task to be complete
>>>>> we should have tests of protobuf in PubSubTableProviderIT, which is
>>>>> parameterized by payload type [3].
>>>>>
>>>>> Regarding PubSubIO.readProtos: you're right to point out there's some
>>>>> overlap between the various PubSub readers/writers and the TableProviders.
>>>>> Ideally we'd define the logic for reading/writing Beam Rows with each IO 
>>>>> in
>>>>> a single place, but right now most of this logic lives in SQL's
>>>>> TableProviders, and in a few places it's duplicated into the IOs, as with
>>>>> readAvrosWithBeamSchema. For this task I think the right thing to do is 
>>>>> use
>>>>> PayloadSerializerProvider.
>>>>>
>>>>> [1] https://github.com/apache/beam/pull/13825
>>>>> [2]
>>>>> https://github.com/apache/beam/blob/3fc2ab10d9f5d5c5b65ecf94ce45861857206674/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubSchemaIOProvider.java#L111
>>>>> [3]
>>>>> https://github.com/apache/beam/blob/3fc2ab10d9f5d5c5b65ecf94ce45861857206674/sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/meta/provider/pubsub/PubsubTableProviderIT.java#L108
>>>>>
>>>>> On Thu, Mar 18, 2021 at 10:32 AM Fernando Morales Martinez <
>>>>> fernando.mora...@wizeline.com> wrote:
>>>>>
>>>>>> Hi everyone,
>>>>>> I'm working on the WI mentioned in the subject: "Add Proto support to
>>>>>> Pubsub table provider" and I have a few questions . Sorry for the long 
>>>>>> mail!
>>>>>>
>>>>>>    - The only method in KafkaTableProvider that performs some logic
>>>>>>    is buildBeamSqlTable. However, when taking a look at the tests in
>>>>>>    KafkaTableProviderIT, the only one that is calling that method 
>>>>>> appears to
>>>>>>    be testFake2. But if I’m not mistaken, that test doesn’t perform any 
>>>>>> test
>>>>>>    for the proto case. The only one that tests the proto case is testFake
>>>>>>    test, but that is only creating the KafkaTableProvider and that’s it. 
>>>>>> I
>>>>>>    wanted to base the PubSubTableProvider on the KafkaTableProvider 
>>>>>> since that
>>>>>>    one supports Proto and looks like it accomplishes that support by 
>>>>>> using
>>>>>>    PayloadSerializer. Is that correct? Should I follow that path?
>>>>>>
>>>>>>
>>>>>>    - I wanted to base the new code on the commits by Piotr, but a
>>>>>>    lot of the code he submitted appears to have been removed, but I can’t
>>>>>>    grasp the reason. There are several commits referencing one another 
>>>>>> and the
>>>>>>    Proto support. Can you shed some light on that?
>>>>>>
>>>>>>
>>>>>>    - Back to the PubsubTableProvider: PubSubIO, which contains
>>>>>>    reader/writer methods for protos (although via ProtoCoder), is being 
>>>>>> used
>>>>>>    by the PubsubSchemaIOProvider class for other read/write purposes. 
>>>>>> Why are
>>>>>>    the protos reader/writer not used? Because of the ProtoCoder? Should 
>>>>>> we
>>>>>>    instead implement new readers/writers by formatting the payload 
>>>>>> directly,
>>>>>>    in a similar fashion to readAvrosWithBeamSchema?
>>>>>>
>>>>>> Thanks a lot for the help!
>>>>>>
>>>>>> - Fernando Morales
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> *This email and its contents (including any attachments) are being
>>>>>> sent toyou on the condition of confidentiality and may be protected by
>>>>>> legalprivilege. Access to this email by anyone other than the intended
>>>>>> recipientis unauthorized. If you are not the intended recipient, please
>>>>>> immediatelynotify the sender by replying to this message and delete the
>>>>>> materialimmediately from your system. Any further use, dissemination,
>>>>>> distributionor reproduction of this email is strictly prohibited. 
>>>>>> Further,
>>>>>> norepresentation is made with respect to any content contained in this
>>>>>> email.*
>>>>>
>>>>>

-- 
*This email and its contents (including any attachments) are being sent to
you on the condition of confidentiality and may be protected by legal
privilege. Access to this email by anyone other than the intended recipient
is unauthorized. If you are not the intended recipient, please immediately
notify the sender by replying to this message and delete the material
immediately from your system. Any further use, dissemination, distribution
or reproduction of this email is strictly prohibited. Further, no
representation is made with respect to any content contained in this email.*

Reply via email to