Looking at Tim's PR.

On Tue, Jul 31, 2018 at 10:53 AM Ismaël Mejía <[email protected]> wrote:

> Reuven, I have already started the review and hope to finish later today or
> tomorrow at the latest. If you can, it would be good to take a look at Tim's
> PR, which has been open for longer.
>
> On Tue, Jul 31, 2018, 6:36 PM Tim Robertson <[email protected]>
> wrote:
>
>> I took a pass at reviewing (non-committer). I haven't worked on unbounded
>> IO, so I wasn't familiar enough with the timestamp and checkpointing parts,
>> but otherwise it LGTM in general. Thanks, John, also for applying the minor
>> suggestions.
>>
>> OT: Reuven, if you have time on your hands there is also the KuduIO
>> awaiting review (https://github.com/apache/beam/pull/6021)
>>
>> On Tue, Jul 31, 2018 at 5:07 PM, Reuven Lax <[email protected]> wrote:
>>
>>> Ismael, do you have time for this review? If you're too busy, I can try
>>> to help review it.
>>>
>>> John, unfortunately, as Ismael said, even if we speed up the review, the
>>> 2.6.0 branch has already been cut, and we try to cherry-pick only
>>> important bugfixes. Hopefully the next release will be soon, and it's also
>>> possible to use the nightly Beam releases in the interim.
>>>
>>> Reuven
>>>
>>> On Tue, Jul 31, 2018 at 5:14 AM Ismaël Mejía <[email protected]> wrote:
>>>
>>>> Hi, we can try to speed up the review, but the 2.6.0 branch was
>>>> already cut and has been stabilizing for the last two weeks, so I am not
>>>> sure it will make it. The next release should be cut shortly, hopefully in
>>>> 3-4 weeks, to follow the 6-week release plan. I hope this can work for
>>>> you.
>>>>
>>>> On Tue, Jul 31, 2018 at 2:13 AM John Rudolf Lewis <[email protected]>
>>>> wrote:
>>>> >
>>>> > I created a pr for my SqsIO contribution. I look forward to your comments.
>>>> >
>>>> > https://github.com/apache/beam/pull/6101
>>>> >
>>>> > Any chance this could be a part of the 2.6.0 release?
>>>> >
>>>> > On Thu, Jul 19, 2018 at 7:39 AM, John Rudolf Lewis <[email protected]> wrote:
>>>> >>
>>>> >> Thank you.
>>>> >>
>>>> >> I've created a jira ticket to add SQS and have assigned it to myself: https://issues.apache.org/jira/browse/BEAM-4828
>>>> >>
>>>> >> Modified the documentation to show it as in-progress: https://github.com/apache/beam/pull/5995
>>>> >>
>>>> >> And will be starting my work here: https://github.com/JohnRudolfLewis/beam/tree/Add-SqsIO
>>>> >>
>>>> >>
>>>> >> On Thu, Jul 19, 2018 at 1:43 AM, Jean-Baptiste Onofré <[email protected]> wrote:
>>>> >>>
>>>> >>> Agree with Ismaël.
>>>> >>>
>>>> >>> I would be more than happy to help on this one (as I contributed on AMQP
>>>> >>> and JMS IOs ;)).
>>>> >>>
>>>> >>> Regards
>>>> >>> JB
>>>> >>>
>>>> >>> On 19/07/2018 10:39, Ismaël Mejía wrote:
>>>> >>> > Thanks for your interest John, it would be a really nice contribution
>>>> >>> > to add SQS support.
>>>> >>> >
>>>> >>> > Some context on the kinesis stuff:
>>>> >>> >
>>>> >>> > The reason why Kinesis is still in a separate module is more related
>>>> >>> > to a licensing problem. Kinesis uses some native libraries that are
>>>> >>> > published under a license that is not 100% Apache-compatible, and we
>>>> >>> > are not allowed to shade and republish them, but it seems there is a
>>>> >>> > workaround now. For more details see
>>>> >>> > https://issues.apache.org/jira/browse/BEAM-3549
>>>> >>> > In any case, if SQS only needs the Apache-licensed aws-sdk deps, it
>>>> >>> > is ok (and a good idea) to put it in the amazon-web-services module.
>>>> >>> >
>>>> >>> > The Kinesis connector is way more complex for multiple reasons. First,
>>>> >>> > the raw version of the Amazon client libraries is not so ‘friendly’,
>>>> >>> > and the guys who created KinesisIO had to do some workarounds to
>>>> >>> > provide accurate checkpointing/watermarks. Since SQS is a way
>>>> >>> > simpler system, you should probably be ok basing it on simpler sources
>>>> >>> > like AMQP or JMS.
>>>> >>> >
>>>> >>> > If you feel like it, please create the JIRA and don’t hesitate to ask
>>>> >>> > questions if you find issues or if you need some review.
>>>> >>> >
>>>> >>> > On Thu, Jul 19, 2018 at 12:55 AM Lukasz Cwik <[email protected]> wrote:
>>>> >>> >>
>>>> >>> >>
>>>> >>> >>
>>>> >>> >> On Wed, Jul 18, 2018 at 3:30 PM John Rudolf Lewis <[email protected]> wrote:
>>>> >>> >>>
>>>> >>> >>> I need an SQS source for my project that is using beam. A brief search did not turn up any in-progress work in this area. Please point me to the right repo if I missed it.
>>>> >>> >>
>>>> >>> >>
>>>> >>> >> To my knowledge there is none and nobody has marked it in progress on https://beam.apache.org/documentation/io/built-in/. It would be good to create a JIRA issue on https://issues.apache.org/ and send a PR to add SQS to the in-progress list referencing your JIRA. I added you as a contributor in JIRA so you should be able to assign yourself to any issues that you create.
>>>> >>> >>
>>>> >>> >>>
>>>> >>> >>> Assuming there is no in-progress effort, I would like to contribute an Amazon SQS source. I have a few questions before I begin.
>>>> >>> >>
>>>> >>> >>
>>>> >>> >> Great, note that this is a good starting point for authoring an IO transform: https://beam.apache.org/documentation/io/authoring-overview/
>>>> >>> >>
>>>> >>> >>>
>>>> >>> >>>
>>>> >>> >>> It seems that the current AWS code is split into two different modules: sdk/java/io/amazon-web-services which contains the S3FileSystem, AwsOptions, etc, and sdk/java/io/kinesis which contains an unbounded source based on a kinesis topic. I'd like to add this source to the amazon-web-services module since I'd like to depend on AwsOptions. Does adding this source to the amazon-web-services module make sense?
>>>> >>> >>
>>>> >>> >>
>>>> >>> >> Putting it inside of amazon-web-services makes a lot of sense. The Google connectors all live within the one package and there has been discussion to consolidate all the AWS stuff under amazon-web-services.
>>>> >>> >>
>>>> >>> >>>
>>>> >>> >>> Also, the kinesis source looks a touch more complex than other sources. Both the JMS and AMQP sources look like better examples to follow. Which existing source would be the best to model this contribution after?
>>>> >>> >>
>>>> >>> >>
>>>> >>> >> Some of it has to do with how many ways a source can be read and how complicated the watermark tracking is, but it would be best if the IO authors comment on implementation details.
>>>> >>> >>
>>>> >>> >>>
>>>> >>> >>> If anyone has put some thoughts into this, or better yet some code, I'd appreciate hearing from you.
>>>> >>> >>>
>>>> >>> >>> Thanks!
>>>> >>> >>>
>>>> >>>
>>>> >>> --
>>>> >>> Jean-Baptiste Onofré
>>>> >>> [email protected]
>>>> >>> http://blog.nanthrax.net
>>>> >>> Talend - http://www.talend.com
>>>> >>
>>>> >>
>>>> >
>>>>
>>>
>>
