By "Relational" I mean things like: Column Pruning, Filter Pushdown, Table Statistics, Partition Metadata, Metastore. We have a bunch of one-off implementations in various IOs (mostly BigQueryIO) and have been waiting for IO standards to push them out to all IOs. This was section "F5 - Relational" from https://s.apache.org/beam-io-api-standard-documentation
On Thu, Dec 15, 2022 at 6:50 PM Herman Mak <herman...@google.com> wrote:

> Hey all,
>
> Firstly apologies for the confusion.
>
> The scope of this effort is to *finalize and have this added to the Beam
> public documentation* to be used as a PR reference once we have resolved
> the comments.
> YES this document is a continuation of the below docs with some additional
> components such as testing!
>
> The idea is to convert this to a MD file and add a page under "Developing
> new I/O connectors" with some small cleanup work around this area in other
> pages.
> [image: image.png]
>
> Docs that this is a continuation of:
> https://s.apache.org/beam-io-api-standard-documentation
> https://s.apache.org/beam-io-api-standard
>
> @Andrew Pilloud <apill...@google.com> Totally not intending to start from
> the beginning here, by relational do you mean having this hosted in the
> Beam confluence?
>
> Thanks all, and keep the feedback to the docs coming
>
> Herman Mak | Customer Engineer, Hong Kong, Google Cloud |
> herman...@google.com | +852-3923-5417
>
> On Fri, Dec 16, 2022 at 1:36 AM Chamikara Jayalath <chamik...@google.com>
> wrote:
>
>> On Thu, Dec 15, 2022, 8:33 AM Alexey Romanenko <aromanenko....@gmail.com>
>> wrote:
>>
>>> Cham, do you remember what was a reason to not finalise that doc?
>>
>> I think this is a continuation of those docs (so we are trying to
>> finalize) but probably Herman can explain better.
>>
>>> Personally, I find having such standards very useful (if they are
>>> flexible during a time, of course), especially for new developers and PR
>>> reviewers, and it’d be great to finally have such doc as a part of
>>> contribution guide.
>>
>> +1
>>
>> Thanks,
>> Cham
>>
>>> —
>>> Alexey
>>>
>>> On 13 Dec 2022, at 04:32, Chamikara Jayalath via dev <
>>> dev@beam.apache.org> wrote:
>>>
>>> Yeah, I don't think we either finalized or documented (in the Website) the
>>> previous iteration. This doc seems to contain details from the documents
>>> shared in the previous iteration.
>>>
>>> Thanks,
>>> Cham
>>>
>>> On Mon, Dec 12, 2022 at 6:49 PM Robert Burke <rob...@frantil.com> wrote:
>>>
>>>> I think ultimately: until the docs are clearly available on the Beam site
>>>> itself, it's not documentation. See also, design docs, previous emails, and
>>>> similar.
>>>>
>>>> On Mon, Dec 12, 2022, 6:07 PM Andrew Pilloud via dev <
>>>> dev@beam.apache.org> wrote:
>>>>
>>>>> I believe the previous iteration was here:
>>>>> https://lists.apache.org/thread/3o8glwkn70kqjrf6wm4dyf8bt27s52hk
>>>>>
>>>>> The associated docs are:
>>>>> https://s.apache.org/beam-io-api-standard-documentation
>>>>> https://s.apache.org/beam-io-api-standard
>>>>>
>>>>> This is missing all the relational stuff that was in those docs, this
>>>>> appears to be another attempt starting from the beginning?
>>>>>
>>>>> Andrew
>>>>>
>>>>> On Mon, Dec 12, 2022 at 9:57 AM Alexey Romanenko <
>>>>> aromanenko....@gmail.com> wrote:
>>>>>
>>>>>> Thanks for writing this!
>>>>>>
>>>>>> IIRC, the similar design doc was sent for review here a while ago. Is
>>>>>> this just an updated version and a new one?
>>>>>>
>>>>>> —
>>>>>> Alexey
>>>>>>
>>>>>> On 11 Dec 2022, at 15:16, Herman Mak via dev <dev@beam.apache.org>
>>>>>> wrote:
>>>>>>
>>>>>> Hello Everyone,
>>>>>>
>>>>>> *TLDR*
>>>>>>
>>>>>> Should we adopt a set of standards that Connector I/Os should adhere
>>>>>> to?
>>>>>> Attached is a first version of a Beam I/O Standards guideline that
>>>>>> includes opinionated best practices across important components of a
>>>>>> Connector I/O, namely Documentation, Development and Testing.
>>>>>>
>>>>>> *The Long Version*
>>>>>>
>>>>>> Apache Beam is a unified open-source programming model for both batch
>>>>>> and streaming. It runs on multiple platform runners and integrates with
>>>>>> over 50 services using individually developed I/O Connectors
>>>>>> <https://beam.apache.org/documentation/io/connectors/>.
>>>>>>
>>>>>> Given that Apache Beam connectors are written by many different
>>>>>> developers and at varying points in time, they vary in syntax style,
>>>>>> documentation completeness and testing done. For a new adopter of Apache
>>>>>> Beam, that can definitely cause some uncertainty.
>>>>>>
>>>>>> So should we adopt a set of standards that Connector I/Os should
>>>>>> adhere to?
>>>>>> Attached is a first version, in Doc format, of a Beam I/O Standards
>>>>>> guideline that includes opinionated best practices across important
>>>>>> components of a Connector I/O, namely Documentation, Development and
>>>>>> Testing. And the aim is to incorporate this into the documentation and to
>>>>>> have it referenced as standards for new Connector I/Os (and ideally have
>>>>>> existing Connectors upgraded over time). If it looks helpful, the
>>>>>> immediate next step is that we can convert it into a .md as a PR into
>>>>>> the Beam repo!
>>>>>>
>>>>>> Thanks and looking forward to feedback and discussion,
>>>>>>
>>>>>> [PUBLIC] Beam I/O Standards
>>>>>> <https://docs.google.com/document/d/1BCTpSZDUjK90hYZjcn8aAnPd9vuRfj8YU1j3mpSgRwI/edit?usp=drive_web>
>>>>>>
>>>>>> Herman Mak | Customer Engineer, Hong Kong, Google Cloud |
>>>>>> herman...@google.com | +852-3923-5417