+1, thank you! On Thu, Mar 18, 2021 at 12:28 PM Daniel Collins <dpcoll...@google.com> wrote:
> Sounds good to me, thanks! > > On Thu, Mar 18, 2021 at 3:27 PM Fernando Morales Martinez < > fernando.mora...@wizeline.com> wrote: > >> Brian, Daniel, >> Really appreciate the feedback, thanks! >> >> I'll go through the aforementioned PRs to get a better understanding of >> the proto support effort. >> If it's ok with you, I'd like to implement and add the protobuf tests to >> the PubSubTableProviderIT. >> >> Please let me know if that works for you. >> >> Thanks again, >> - Fernando Morales >> >> On Thu, Mar 18, 2021 at 12:15 PM Daniel Collins <dpcoll...@google.com> >> wrote: >> >>> Also see https://github.com/apache/beam/pull/13838 which is related >>> >>> On Thu, Mar 18, 2021 at 3:06 PM Daniel Collins <dpcoll...@google.com> >>> wrote: >>> >>>> > appears to be dead code >>>> >>>> Its not- its used as the format for the configuration row here >>>> https://github.com/apache/beam/blob/3fc2ab10d9f5d5c5b65ecf94ce45861857206674/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubSchemaIOProvider.java#L145 >>>> >>>> On Thu, Mar 18, 2021 at 2:53 PM Brian Hulette <bhule...@google.com> >>>> wrote: >>>> >>>>> Ah the enum I linked in PubsubSchemaIOProvider appears to be dead >>>>> code, I don't think it's referenced anywhere. So I think the >>>>> implementation >>>>> should be done, we'd just need to modify PubsubTableProviderIT to exercise >>>>> the proto forma. >>>>> >>>>> On Thu, Mar 18, 2021 at 11:46 AM Daniel Collins <dpcoll...@google.com> >>>>> wrote: >>>>> >>>>>> > because there are a couple places where Avro, JSON are still >>>>>> hard-coded [2,3] >>>>>> >>>>>> This is not a blocker, its due to the fact that PubsubTableProvider >>>>>> is just a wrapper for PubsubSchemaIOProvider. SchemaIOProvider requires >>>>>> you >>>>>> to specify all possible options, TableProvider does not. >>>>>> >>>>>> On Thu, Mar 18, 2021 at 2:44 PM Brian Hulette <bhule...@google.com> >>>>>> wrote: >>>>>> >>>>>>> Hi Fernando, >>>>>>> >>>>>>> Daniel Collins actually added the PayloadSerializerProvider concept >>>>>>> very recently [1], which is why it looks like Piotr's code doesn't apply >>>>>>> anymore. But the good news is I think that PR gets this task pretty >>>>>>> close >>>>>>> to completion. It doesn't look like the PR *quite* finished adding >>>>>>> support >>>>>>> for Proto to PubSubTableProvider though, because there are a couple >>>>>>> places >>>>>>> where Avro, JSON are still hard-coded [2,3]. For this task to be >>>>>>> complete >>>>>>> we should have tests of protobuf in PubSubTableProviderIT, which is >>>>>>> parameterized by payload type [3]. >>>>>>> >>>>>>> Regarding PubSubIO.readProtos: you're right to point out there's >>>>>>> some overlap between the various PubSub readers/writers and the >>>>>>> TableProviders. Ideally we'd define the logic for reading/writing Beam >>>>>>> Rows >>>>>>> with each IO in a single place, but right now most of this logic lives >>>>>>> in >>>>>>> SQL's TableProviders, and in a few places it's duplicated into the IOs, >>>>>>> as >>>>>>> with readAvrosWithBeamSchema. For this task I think the right thing to >>>>>>> do >>>>>>> is use PayloadSerializerProvider. >>>>>>> >>>>>>> [1] https://github.com/apache/beam/pull/13825 >>>>>>> [2] >>>>>>> https://github.com/apache/beam/blob/3fc2ab10d9f5d5c5b65ecf94ce45861857206674/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubSchemaIOProvider.java#L111 >>>>>>> [3] >>>>>>> https://github.com/apache/beam/blob/3fc2ab10d9f5d5c5b65ecf94ce45861857206674/sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/meta/provider/pubsub/PubsubTableProviderIT.java#L108 >>>>>>> >>>>>>> On Thu, Mar 18, 2021 at 10:32 AM Fernando Morales Martinez < >>>>>>> fernando.mora...@wizeline.com> wrote: >>>>>>> >>>>>>>> Hi everyone, >>>>>>>> I'm working on the WI mentioned in the subject: "Add Proto support >>>>>>>> to Pubsub table provider" and I have a few questions . Sorry for the >>>>>>>> long >>>>>>>> mail! >>>>>>>> >>>>>>>> - The only method in KafkaTableProvider that performs some >>>>>>>> logic is buildBeamSqlTable. However, when taking a look at the >>>>>>>> tests in >>>>>>>> KafkaTableProviderIT, the only one that is calling that method >>>>>>>> appears to >>>>>>>> be testFake2. But if I’m not mistaken, that test doesn’t perform >>>>>>>> any test >>>>>>>> for the proto case. The only one that tests the proto case is >>>>>>>> testFake >>>>>>>> test, but that is only creating the KafkaTableProvider and that’s >>>>>>>> it. I >>>>>>>> wanted to base the PubSubTableProvider on the KafkaTableProvider >>>>>>>> since that >>>>>>>> one supports Proto and looks like it accomplishes that support by >>>>>>>> using >>>>>>>> PayloadSerializer. Is that correct? Should I follow that path? >>>>>>>> >>>>>>>> >>>>>>>> - I wanted to base the new code on the commits by Piotr, but a >>>>>>>> lot of the code he submitted appears to have been removed, but I >>>>>>>> can’t >>>>>>>> grasp the reason. There are several commits referencing one another >>>>>>>> and the >>>>>>>> Proto support. Can you shed some light on that? >>>>>>>> >>>>>>>> >>>>>>>> - Back to the PubsubTableProvider: PubSubIO, which contains >>>>>>>> reader/writer methods for protos (although via ProtoCoder), is >>>>>>>> being used >>>>>>>> by the PubsubSchemaIOProvider class for other read/write purposes. >>>>>>>> Why are >>>>>>>> the protos reader/writer not used? Because of the ProtoCoder? >>>>>>>> Should we >>>>>>>> instead implement new readers/writers by formatting the payload >>>>>>>> directly, >>>>>>>> in a similar fashion to readAvrosWithBeamSchema? >>>>>>>> >>>>>>>> Thanks a lot for the help! >>>>>>>> >>>>>>>> - Fernando Morales >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> *This email and its contents (including any attachments) are being >>>>>>>> sent toyou on the condition of confidentiality and may be protected by >>>>>>>> legalprivilege. Access to this email by anyone other than the intended >>>>>>>> recipientis unauthorized. If you are not the intended recipient, please >>>>>>>> immediatelynotify the sender by replying to this message and delete the >>>>>>>> materialimmediately from your system. Any further use, dissemination, >>>>>>>> distributionor reproduction of this email is strictly prohibited. >>>>>>>> Further, >>>>>>>> norepresentation is made with respect to any content contained in this >>>>>>>> email.* >>>>>>> >>>>>>> >> >> >> >> >> >> >> >> *This email and its contents (including any attachments) are being sent >> toyou on the condition of confidentiality and may be protected by >> legalprivilege. Access to this email by anyone other than the intended >> recipientis unauthorized. If you are not the intended recipient, please >> immediatelynotify the sender by replying to this message and delete the >> materialimmediately from your system. Any further use, dissemination, >> distributionor reproduction of this email is strictly prohibited. Further, >> norepresentation is made with respect to any content contained in this >> email.* > >