Thx John!

Regards
JB

On 19/07/2018 16:39, John Rudolf Lewis wrote:
> Thank you.
>
> I've created a JIRA ticket to add SQS and have assigned it to
> myself: https://issues.apache.org/jira/browse/BEAM-4828
>
> Modified the documentation to show it as in-progress:
> https://github.com/apache/beam/pull/5995
>
> And will be starting my work
> here: https://github.com/JohnRudolfLewis/beam/tree/Add-SqsIO
>
> On Thu, Jul 19, 2018 at 1:43 AM, Jean-Baptiste Onofré <[email protected]> wrote:
>
> Agree with Ismaël.
>
> I would be more than happy to help on this one (as I contributed to the
> AMQP and JMS IOs ;)).
>
> Regards
> JB
>
> On 19/07/2018 10:39, Ismaël Mejía wrote:
> > Thanks for your interest John, it would be a really nice contribution
> > to add SQS support.
> >
> > Some context on the kinesis stuff:
> >
> > The reason why kinesis is still in a separate module is more related
> > to a licensing problem. Kinesis uses some native libraries that are
> > published under a license that is not 100% Apache-compatible, and we
> > are not allowed to shade and republish them, but it seems there is a
> > workaround now. For more details see
> > https://issues.apache.org/jira/browse/BEAM-3549
> >
> > In any case, if SQS only needs the Apache-licensed aws-sdk deps, it
> > is ok (and a good idea) to put it in the amazon-web-services module.
> >
> > The kinesis connector is way more complex for multiple reasons. First,
> > the raw version of the amazon client libraries is not so ‘friendly’
> > and the guys who created KinesisIO had to do some workarounds to
> > provide accurate checkpointing/watermarks. So since SQS is a way
> > simpler system you should probably be ok basing it on simpler sources
> > like AMQP or JMS.
> >
> > If you feel like it, please create the JIRA and don’t hesitate to ask
> > questions if you find issues or if you need some review.
> >
> > On Thu, Jul 19, 2018 at 12:55 AM Lukasz Cwik <[email protected]> wrote:
> >>
> >> On Wed, Jul 18, 2018 at 3:30 PM John Rudolf Lewis <[email protected]> wrote:
> >>>
> >>> I need an SQS source for my project that is using beam. A brief
> >>> search did not turn up any in-progress work in this area. Please
> >>> point me to the right repo if I missed it.
> >>
> >> To my knowledge there is none and nobody has marked it in progress
> >> on https://beam.apache.org/documentation/io/built-in/. It would be
> >> good to create a JIRA issue on https://issues.apache.org/ and send a
> >> PR to add SQS to the in-progress list referencing your JIRA. I added
> >> you as a contributor in JIRA so you should be able to assign
> >> yourself to any issues that you create.
> >>
> >>>
> >>> Assuming there is no in-progress effort, I would like to
> >>> contribute an Amazon SQS source. I have a few questions before I begin.
> >>
> >> Great, note that this is a good starting point for authoring an
> >> IO transform:
> >> https://beam.apache.org/documentation/io/authoring-overview/
> >>
> >>>
> >>> It seems that the current AWS code is split into two different
> >>> modules: sdk/java/io/amazon-web-services which contains the
> >>> S3FileSystem, AwsOptions, etc, and sdk/java/io/kinesis which
> >>> contains an unbounded source based on a kinesis topic. I'd like to
> >>> add this source to the amazon-web-services module since I'd like to
> >>> depend on AwsOptions. Does adding this source to the
> >>> amazon-web-services module make sense?
> >>
> >> Putting it inside of amazon-web-services makes a lot of sense.
> >> The Google connectors all live within the one package and there has
> >> been discussion to consolidate all the AWS stuff under
> >> amazon-web-services.
> >>
> >>>
> >>> Also, the kinesis source looks a touch more complex than other
> >>> sources. Both the JMS and AMQP sources look like better examples to
> >>> follow. Which existing source would be the best to model this
> >>> contribution after?
> >>
> >> Some of it has to do with how many ways a source can be read and
> >> how complicated the watermark tracking is, but it would be best if
> >> the IO authors comment on implementation details.
> >>
> >>>
> >>> If anyone has put some thoughts into this, or better yet some
> >>> code, I'd appreciate hearing from you.
> >>>
> >>> Thanks!
> >>>
>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>

--
Jean-Baptiste Onofré
[email protected]
http://blog.nanthrax.net
Talend - http://www.talend.com
