By "Relational" I mean things like: Column Pruning, Filter Pushdown, Table
Statistics, Partition Metadata, Metastore. We have a bunch of one-off
implementations in various IOs (mostly BigQueryIO) and have been waiting
for IO standards to push them out to all IOs. This was section "F5 -
Relational" from https://s.apache.org/beam-io-api-standard-documentation

On Thu, Dec 15, 2022 at 6:50 PM Herman Mak <herman...@google.com> wrote:

> Hey all,
>
> Firstly apologies for the confusion.
>
> The scope of this effort is to *finalize and have this added to the Beam
> public documentation* to be used as a PR reference once we have resolved
> the comments.
> YES this document is a continuation of the below docs with some additional
> components such as testing!
>
> The idea is to convert this to a MD file and add a page under "Developing
> new I/O connectors" with some small cleanup work around this area in other
> pages.
>
> Docs that this is a continuation of:
> https://s.apache.org/beam-io-api-standard-documentation
> https://s.apache.org/beam-io-api-standard
>
>
> @Andrew Pilloud <apill...@google.com> Totally not intending to start from
> the beginning here. By relational, do you mean having this hosted in the
> Beam Confluence?
>
> Thanks all, and keep the feedback to the docs coming
>
> Herman Mak |  Customer Engineer, Hong Kong, Google Cloud |
> herman...@google.com |  +852-3923-5417 <+852%203923%205417>
>
>
>
>
>
> On Fri, Dec 16, 2022 at 1:36 AM Chamikara Jayalath <chamik...@google.com>
> wrote:
>
>>
>>
>> On Thu, Dec 15, 2022, 8:33 AM Alexey Romanenko <aromanenko....@gmail.com>
>> wrote:
>>
>>> Cham, do you remember the reason that doc was not finalised?
>>>
>>
>> I think this is a continuation of those docs (so we are trying to
>> finalize), but Herman can probably explain better.
>>
>>
>>> Personally, I find having such standards very useful (if they stay
>>> flexible over time, of course), especially for new developers and PR
>>> reviewers, and it’d be great to finally have such a doc as part of the
>>> contribution guide.
>>>
>>
>> +1
>>
>> Thanks,
>> Cham
>>
>>>
>>> —
>>> Alexey
>>>
>>> On 13 Dec 2022, at 04:32, Chamikara Jayalath via dev <
>>> dev@beam.apache.org> wrote:
>>>
>>> Yeah, I don't think we either finalized or documented (on the website) the
>>> previous iteration. This doc seems to contain details from the documents
>>> shared in the previous iteration.
>>>
>>> Thanks,
>>> Cham
>>>
>>>
>>>
>>> On Mon, Dec 12, 2022 at 6:49 PM Robert Burke <rob...@frantil.com> wrote:
>>>
>>>> I think ultimately: until the docs are clearly available on the Beam site
>>>> itself, it's not documentation. See also, design docs, previous emails, and
>>>> similar.
>>>>
>>>> On Mon, Dec 12, 2022, 6:07 PM Andrew Pilloud via dev <
>>>> dev@beam.apache.org> wrote:
>>>>
>>>>> I believe the previous iteration was here:
>>>>> https://lists.apache.org/thread/3o8glwkn70kqjrf6wm4dyf8bt27s52hk
>>>>>
>>>>> The associated docs are:
>>>>> https://s.apache.org/beam-io-api-standard-documentation
>>>>> https://s.apache.org/beam-io-api-standard
>>>>>
>>>>> This is missing all the relational stuff that was in those docs. Does
>>>>> this mean it is another attempt starting from the beginning?
>>>>>
>>>>> Andrew
>>>>>
>>>>>
>>>>> On Mon, Dec 12, 2022 at 9:57 AM Alexey Romanenko <
>>>>> aromanenko....@gmail.com> wrote:
>>>>>
>>>>>> Thanks for writing this!
>>>>>>
>>>>>> IIRC, a similar design doc was sent for review here a while ago. Is
>>>>>> this just an updated version or a new one?
>>>>>>
>>>>>> —
>>>>>> Alexey
>>>>>>
>>>>>> On 11 Dec 2022, at 15:16, Herman Mak via dev <dev@beam.apache.org>
>>>>>> wrote:
>>>>>>
>>>>>> Hello Everyone,
>>>>>>
>>>>>> *TLDR*
>>>>>>
>>>>>> Should we adopt a set of standards that Connector I/Os should adhere
>>>>>> to?
>>>>>> Attached is a first version of a Beam I/O Standards guideline that
>>>>>> includes opinionated best practices across important components of a
>>>>>> Connector I/O, namely Documentation, Development and Testing.
>>>>>>
>>>>>> *The Long Version*
>>>>>>
>>>>>> Apache Beam is a unified open-source programming model for both batch
>>>>>> and streaming. It runs on multiple platform runners and integrates with
>>>>>> over 50 services using individually developed I/O Connectors
>>>>>> <https://beam.apache.org/documentation/io/connectors/>.
>>>>>>
>>>>>> Given that Apache Beam connectors are written by many different
>>>>>> developers and at varying points in time, they vary in syntax style,
>>>>>> documentation completeness and testing done. For a new adopter of Apache
>>>>>> Beam, that can definitely cause some uncertainty.
>>>>>>
>>>>>> So should we adopt a set of standards that Connector I/Os should
>>>>>> adhere to?
>>>>>> Attached is a first version, in Doc format, of a Beam I/O Standards
>>>>>> guideline that includes opinionated best practices across important
>>>>>> components of a Connector I/O, namely Documentation, Development and
>>>>>> Testing. And the aim is to incorporate this into the documentation and to
>>>>>> have it referenced as standards for new Connector I/Os (and ideally have
>>>>>> existing Connectors upgraded over time). If it looks helpful, the
>>>>>> immediate next step is that we can convert it into a .md as a PR into
>>>>>> the Beam repo!
>>>>>>
>>>>>> Thanks, and looking forward to feedback and discussion,
>>>>>>
>>>>>>  [PUBLIC] Beam I/O Standards
>>>>>> <https://docs.google.com/document/d/1BCTpSZDUjK90hYZjcn8aAnPd9vuRfj8YU1j3mpSgRwI/edit?usp=drive_web>
>>>>>>
>>>>>> Herman Mak |  Customer Engineer, Hong Kong, Google Cloud |
>>>>>> herman...@google.com |  +852-3923-5417 <+852%203923%205417>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>
