Re: [DISCUSS] KIP-310: Add a Kafka Source Connector to Kafka Connect

2018-10-25 Thread McCaig, Rhys
Hi All,

Based on the feedback in this thread, and in light of Ryanne’s excellent 
proposal (KIP-382: MirrorMaker 2.0) which incorporates and extends the goals of 
KIP-310, I have updated the status of KIP-310 to “Discarded” and added a
comment that KIP-382 supersedes it.

Thank you all for the discussion and feedback - this is my first KIP and I 
appreciate the community providing feedback on my contributions!

Rhys

> On Sep 26, 2018, at 10:42 AM, Konstantine Karantasis wrote:
> 
> Hi Rhys,
> 
> thanks for the proposal and apologies for the late feedback. Utilizing
> Connect to mirror Kafka topics is definitely a plausible proposal for a
> very useful use case.
> 
> However, I don't think the apache/kafka repository is the right place to
> host such a Connector. Currently, no full-featured, production-ready
> connectors are hosted in AK. The only two connectors shipped with AK
> (FileStreamSourceConnector and FileStreamSinkConnector) are there only as
> example implementations.
> 
> I find this approach very appealing. AK focuses on providing the core
> infrastructure for Connect, that is required in every Kafka Connect
> deployment, as well as offering the means to generically install, deploy
> and operate connectors. But all the connectors reside outside AK and
> comprise a vibrant ecosystem of open source and proprietary components
> that, essentially - even for the most useful and ubiquitous of the
> connectors - are optional for users to install and use. This seems simple
> and flexible, both in terms of releasing and using/deploying software
> related to Kafka Connect. I might even say that I'd be in favor of
> extending this approach to all the Connect components, including
> Transformations and Converters.
> 
> I'm aware that MirrorMaker is part of AK, but to me this dates back to the
> early days of Apache Kafka, when the project and its ecosystem were
> smaller, Connect and Streams had not been implemented yet, and
> mirroring topics between Kafka clusters was already a basic need. With a
> much richer ecosystem now and more sizable and well-defined packages in
> AK, I think the approach that decouples connectors from the Connect
> framework itself is a good one.
> 
> In my opinion, the fact that this connector targets Kafka itself as a
> source is not an adequate reason to include it in apache/kafka within the
> Connect framework. It seems it can evolve naturally, like every other
> connector, in its own repository.
> 
> Regards,
> Konstantine
> 
> 
> On Sat, Aug 4, 2018 at 7:20 PM McCaig, Rhys wrote:
> 
>> Hi All,
>> 
>> If there are no further comments on this KIP I’ll start a vote early this
>> week.
>> 
>> Rhys
>> 
>> On Aug 1, 2018, at 12:32 AM, McCaig, Rhys wrote:
>> 
>> Hi All,
>> 
>> I’ve updated the proposal to include the improvements suggested by
>> Stephane.
>> 
>> I have also submitted a PR to implement this functionality into Kafka.
>> https://github.com/apache/kafka/pull/5438
>> 
>> I don’t have a benchmark against MirrorMaker yet, as I currently only have
>> a local Docker stack available to me, though I have seen very good
>> performance in that test stack (200k messages/sec with 100-byte messages on
>> containers with limited compute resources). Further benchmarking might take
>> a few days.
>> 
>> Review and comments would be appreciated.
>> 
>> Cheers,
>> Rhys
>> 
>> 
>> On Jun 18, 2018, at 9:00 AM, McCaig, Rhys wrote:
>> 
>> Hi Stephane,
>> 
>> Thanks for your feedback and apologies for the delay in my response.
>> 
>> Are there any performance benchmarks against Mirror Maker available? I'm
>> interested to know if this is more performant / scalable.
>> Regarding the implementation, here's some feedback:
>> 
>> 
>> Currently I don’t have any performance benchmarks, but I think this is a
>> great idea. I’ll see if I can set something up in the next week or so.
>> 
>> - I think it's worth mentioning that this solution does not rely on
>> consumer groups, and therefore tracking progress may be tricky. Can you
>> think of a way to expose that?
>> 
>> This is a reasonable concern. I’m not sure how to track this other than
>> by looking at the Kafka Connect offsets. Once a message is passed to the
>> framework, I'm unaware of a way to get at the commit offsets on the
>> producer side. Any thoughts?
>> 
>> - Some code can be in config Validator I believe:
>> 
>> https://github.com/Comcast/MirrorTool-for-Kafka-Connect/blob/master/src/main/java/com/comcast/kafka/connect/kafka/KafkaSourceConnector.java#L47
>> 
>> - I think your kip mentions `source.admin.` and `source.consumer.` but I
>> don't see it reflected yet in the code
>> 
>> - Is there a way to be flexible and merge list and regex, or offer the two
>> simultaneously ? source_topics=my_static_topic,prefix.* ?
>> 
>> Agree on all of the above - I will incorporate these into the code later
>> this week, as I’ll get some time back to work on this.
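For readers following the `source.admin.` / `source.consumer.` point above: one way such prefixed settings could be split out of a connector's properties is sketched below. This is an illustration only, not code from the KIP, the PR, or the linked repository. The class and method names are hypothetical, and the property keys are just examples.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of KIP-310 or its PR.
final class PrefixedConfig {

    // Returns every property whose key starts with the given prefix,
    // with the prefix stripped from the key. Other keys are ignored.
    static Map<String, String> withPrefix(Map<String, String> props, String prefix) {
        Map<String, String> result = new HashMap<>();
        for (Map.Entry<String, String> entry : props.entrySet()) {
            String key = entry.getKey();
            if (key.startsWith(prefix) && key.length() > prefix.length()) {
                result.put(key.substring(prefix.length()), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("source.consumer.max.poll.records", "500");   // example key
        props.put("source.admin.request.timeout.ms", "30000");  // example key
        props.put("tasks.max", "1");

        // Settings intended for the source-cluster consumer.
        System.out.println(withPrefix(props, "source.consumer."));
        // Settings intended for the source-cluster admin client.
        System.out.println(withPrefix(props, "source.admin."));
    }
}
```

Inside an actual connector, Kafka's own `AbstractConfig.originalsWithPrefix` provides similar prefix stripping, so a hand-written helper like this may not be needed.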

Re: [DISCUSS] KIP-310: Add a Kafka Source Connector to Kafka Connect

2018-06-06 Thread Stephane Maarek
Hi Rhys,

I think this will be a great addition.

Are there any performance benchmarks against Mirror Maker available? I'm
interested to know if this is more performant / scalable.
Regarding the implementation, here's some feedback:

- I think it's worth mentioning that this solution does not rely on
consumer groups, and therefore tracking progress may be tricky. Can you
think of a way to expose that?

- Some code can be in config Validator I believe:
https://github.com/Comcast/MirrorTool-for-Kafka-Connect/blob/master/src/main/java/com/comcast/kafka/connect/kafka/KafkaSourceConnector.java#L47

- I think your kip mentions `source.admin.` and `source.consumer.` but I
don't see it reflected yet in the code

- Is there a way to be flexible and merge list and regex, or offer the two
simultaneously ? source_topics=my_static_topic,prefix.* ?
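To make the last suggestion concrete, here is a minimal sketch of how a single `source_topics` value could accept both static names and regexes. It is an assumption about one possible approach, not the connector's actual behavior; `TopicSelector` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not taken from the connector code.
final class TopicSelector {
    private final List<Pattern> patterns = new ArrayList<>();

    // Splits the value on commas and compiles each entry as a regex.
    // A plain name such as "my_static_topic" only matches itself, because
    // matching below requires the whole topic name to match.
    // Caveat: a '.' in a literal topic name acts as a regex wildcard here.
    static TopicSelector parse(String sourceTopics) {
        TopicSelector selector = new TopicSelector();
        for (String entry : sourceTopics.split(",")) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty()) {
                selector.patterns.add(Pattern.compile(trimmed));
            }
        }
        return selector;
    }

    // True if the topic name fully matches any configured entry.
    boolean matches(String topic) {
        for (Pattern pattern : patterns) {
            if (pattern.matcher(topic).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        TopicSelector selector = TopicSelector.parse("my_static_topic,prefix.*");
        System.out.println(selector.matches("my_static_topic")); // static entry
        System.out.println(selector.matches("prefix.orders"));   // regex entry
        System.out.println(selector.matches("other_topic"));     // no entry matches
    }
}
```

Treating every entry as a regex keeps the configuration to a single key, at the cost of the wildcard caveat noted in the comment; separate list and regex settings would avoid that ambiguity.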

Hope that helps
Stephane


On 5 June 2018 at 09:04, McCaig, Rhys wrote:

> Hi All,
>
> As I didn’t get any comments on this KIP, and two additional KIPs numbered
> 308 have been created since, I'm bumping this thread and renumbering the
> KIP to 310 to remove the duplication:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-310%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect
>
> Let me know if you have any comments or feedback, would love to hear them.
>
> Cheers,
> Rhys
>
> > On May 28, 2018, at 10:23 PM, McCaig, Rhys wrote:
> >
> > Sorry for the bad link to the KIP, here it is:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-308%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect
> >
> >> On May 28, 2018, at 10:19 PM, McCaig, Rhys wrote:
> >>
> >> Hi All,
> >>
> >> I added a KIP to include a Kafka Source Connector with Kafka Connect.
> >> Here is the KIP:
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-308%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect
> >>
> >> Looking forward to your feedback and suggestions.
> >>
> >> Cheers,
> >> Rhys
> >>
> >>
> >
>
>