Thanks for offering to help. I would suggest looking into the existing Java BigTableIO connector and the currently available Python client library for Cloud BigTable to see how feasible it is to develop an efficient BigTable connector at this point. From the Python SDK's perspective, you can use the iobase.BoundedSource API (wrapped by a PTransform) to develop a read PTransform with support for dynamic/static splitting. Sinks are usually developed as PTransforms (the iobase.Sink interface is deprecated, so I suggest not using it). I would be happy to review any PRs related to this.
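
To make the splitting idea a bit more concrete, here is a rough, Beam-free sketch of the shape such a source tends to take. The method names (estimate_size, split, read) mirror iobase.BoundedSource, but InMemoryTableSource, Bundle, and the row data are hypothetical stand-ins invented for illustration. A real connector would subclass iobase.BoundedSource, read from Cloud BigTable rather than a list, and use a range tracker in read() to support dynamic splitting.

```python
from collections import namedtuple

# Hypothetical stand-in for a source bundle: weight, source, start, stop.
Bundle = namedtuple("Bundle", ["weight", "source", "start", "stop"])


class InMemoryTableSource(object):
    """Toy source over a sorted list of (row_key, value) pairs.

    Assumption: this only imitates the BoundedSource method names; it does
    not subclass anything from Beam and does not talk to BigTable.
    """

    def __init__(self, rows):
        self._rows = sorted(rows)

    def estimate_size(self):
        # "Size" here is just the row count; a real source would estimate bytes.
        return len(self._rows)

    def split(self, desired_bundle_size, start=None, stop=None):
        # Static splitting: cut [start, stop) into index ranges holding at
        # most desired_bundle_size rows each.
        start = 0 if start is None else start
        stop = len(self._rows) if stop is None else stop
        while start < stop:
            end = min(start + desired_bundle_size, stop)
            yield Bundle(end - start, self, start, end)
            start = end

    def read(self, start, stop):
        # A real Beam source receives a range tracker here and claims each
        # position before emitting it; this toy version reads a fixed range.
        for i in range(start, stop):
            yield self._rows[i]


rows = [("row-%02d" % i, i * i) for i in range(10)]
source = InMemoryTableSource(rows)
bundles = list(source.split(desired_bundle_size=4))
read_back = [r for b in bundles for r in b.source.read(b.start, b.stop)]
```

Each bundle can be read independently, which is what lets a runner parallelize the read; concatenating the bundles in order gives back the full row set.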

Thanks,
Cham

On Sun, May 28, 2017 at 2:30 AM Matthias Baetens <[email protected]> wrote:

> Hey guys,
>
> We have been using Beam for quite a few months now, so we (my colleague
> Robert & I) thought it might be cool to contribute a bit as well.
>
> The challenge we want to take up is writing the BigTableIO for the Python
> SDK (which is not yet in the works according to the website
> <https://github.com/apache/beam-site/blob/asf-site/src/documentation/io/built-in.md>.
> I have searched JIRA for the BigTableIO issue and did not find it, so I
> suppose this is the first step we take.
>
> Any pointers or feedback more than welcome!
>
> Best,
>
> Matthias
