Understood. Thank you. On Tue, Jul 31, 2018 at 5:13 AM, Ismaël Mejía <[email protected]> wrote:
> Hi, we can try to speed up the review, but the 2.6.0 branch was > already cut and was stabilizing for the last two weeks, so I am not > sure it will make it. Next release should be cut shortly hopefully in > 3-4 weeks to follow the 6 week release plan. Hope this can work for > you. > > On Tue, Jul 31, 2018 at 2:13 AM John Rudolf Lewis <[email protected]> > wrote: > > > > I created a pr for my SqsIO contribution. I look forward to your > comments. > > > > https://github.com/apache/beam/pull/6101 > > > > Any chance this could be a part of the 2.6.0 release? > > > > On Thu, Jul 19, 2018 at 7:39 AM, John Rudolf Lewis <[email protected]> > wrote: > >> > >> Thank you. > >> > >> I've created a jira ticket to add SQS and have assigned it to myself: > https://issues.apache.org/jira/browse/BEAM-4828 > >> > >> Modified the documentation to show it as in-progress: > https://github.com/apache/beam/pull/5995 > >> > >> And will be starting my work here: https://github.com/ > JohnRudolfLewis/beam/tree/Add-SqsIO > >> > >> > >> On Thu, Jul 19, 2018 at 1:43 AM, Jean-Baptiste Onofré <[email protected]> > wrote: > >>> > >>> Agree with Ismaël. > >>> > >>> I would be more than happy to help on this one (as I contributed on > AMQP > >>> and JMS IOs ;)). > >>> > >>> Regards > >>> JB > >>> > >>> On 19/07/2018 10:39, Ismaël Mejía wrote: > >>> > Thanks for your interest John, it would be a really nice contribution > >>> > to add SQS support. > >>> > > >>> > Some context on the kinesis stuff: > >>> > > >>> > The reason why kinesis is still in a separate module is more related > >>> > to a licensing problem. Kinesis uses some native libraries that are > >>> > published under a not 100% apache compatible license and we are not > >>> > allowed to shade and republish them but it seems there is a > workaround > >>> > now, for more details see > >>> > https://issues.apache.org/jira/browse/BEAM-3549 > >>> > In any case if to use SQS you only need the Apache licensed aws-sdk > >>> > deps it is ok (and a good idea) if you put it in the > >>> > amazon-web-services module. > >>> > > >>> > The kinesis connector is way more complex for multiple reasons, > first, > >>> > the raw version of the amazon client libraries is not so ‘friendly’ > >>> > and the guys who created KinesisIO had to do some workarounds to > >>> > provide accurate checkpointing/watermarks. So since SQS is a way > >>> > simpler system you should probably be ok basing it in simpler sources > >>> > like AMQP or JMS. > >>> > > >>> > If you feel like to, please create the JIRA and don’t hesitate to ask > >>> > questions if you find issues or if you need some review. > >>> > > >>> > On Thu, Jul 19, 2018 at 12:55 AM Lukasz Cwik <[email protected]> > wrote: > >>> >> > >>> >> > >>> >> > >>> >> On Wed, Jul 18, 2018 at 3:30 PM John Rudolf Lewis < > [email protected]> wrote: > >>> >>> > >>> >>> I need an SQS source for my project that is using beam. A brief > search did not turn up any in-progress work in this area. Please point me > to the right repo if I missed it. > >>> >> > >>> >> > >>> >> To my knowledge there is none and nobody has marked it in progress > on https://beam.apache.org/documentation/io/built-in/. It would be good > to create a JIRA issue on https://issues.apache.org/ and send a PR to add > SQS to the inprogress list referencing your JIRA. I added you as a > contributor in JIRA so you should be able to assign yourself to any issues > that you create. > >>> >> > >>> >>> > >>> >>> Assuming there is no in-progress effort, I would like to > contribute an Amazon SQS source. I have a few questions before I begin. > >>> >> > >>> >> > >>> >> Great, note that this is a good starting point for authoring an IO > transform: https://beam.apache.org/documentation/io/authoring-overview/ > >>> >> > >>> >>> > >>> >>> > >>> >>> It seems that the current AWS code is split into two different > modules: sdk/java/io/amazon-web-services which contains the S3FileSystem, > AwsOptions, etc, and sdk/java/io/kinesis which contains an unbounded source > based on a kinesis topic. I'd like to add this source to the > amazon-web-services module since I'd like to depend on AwsOptions. Does > adding this source to the amazon-web-services module make sense? > >>> >> > >>> >> > >>> >> Putting it inside of amazon-web-services makes a lot of sense. The > Google connectors all live within the one package and there has been > discussion to consolidate all the AWS stuff under amazon-web-services. > >>> >> > >>> >>> > >>> >>> Also, the kinesis source looks a touch more complex than other > sources. Both the JMS and AMQP sources look like better examples to follow. > Which existing source would be the best to model this contribution after? > >>> >> > >>> >> > >>> >> Some of it has to do with how many ways a source can be read and > how complicated the watermark tracking but it would be best if the IO > authors comment on implementation details. > >>> >> > >>> >>> > >>> >>> If anyone has put some thoughts into this, or better yet some > code, I'd appreciate hearing from you. > >>> >>> > >>> >>> Thanks! > >>> >>> > >>> > >>> -- > >>> Jean-Baptiste Onofré > >>> [email protected] > >>> http://blog.nanthrax.net > >>> Talend - http://www.talend.com > >> > >> > > >
