Re: Cleaning up command line tools argument parsing a little

2019-04-17 Thread McCaig, Rhys
Sönke - I think this is a great idea. I’d be willing to help out where I can if 
there is a new issue to track this.

> On Apr 17, 2019, at 9:02 AM, Sönke Liebau wrote:
> 
> I actually have a theory how that came about.
> 
> All classes that use argparse4j are situated in the tools and connect
> projects, which don't have a dependency on core. But that's where all the
> CommandLine stuff that uses joptsimple is located. So to gain access to
> that (not joptsimple itself, but all the helper classes) would have meant
> adding a dependency on core to those - that may have triggered the search
> for something else that ended up with argparse4j.
> 
> Not sure if that was what happened, but so far it's the only reason I could
> come up with.
> 
> On Wed, Apr 17, 2019 at 5:55 PM Guozhang Wang  wrote:
> 
>> I took another look at the PR itself and I think it would be great to have
>> this cleanup too -- I cannot remember why we gradually moved to a different
>> mechanism (argparse4j) for different cmds in the first place; if there's no
>> rationale behind it we should just make them consistent.
>> 
>> Thanks for driving this!
>> 
>> Guozhang
>> 
>> On Wed, Apr 17, 2019 at 7:19 AM Ryanne Dolan wrote:
>> 
>>> Sönke, I'd find this very helpful. It's annoying to keep track of which
>>> commands use which form -- I always seem to guess wrong.
>>> 
>>> Though I don't think there is any reason to deprecate existing forms, e.g.
>>> consumer.config vs consumer-config. I think it's perfectly reasonable to
>>> have multiple spellings of the same arguments. I don't really see a
>>> downside to keeping the aliases around indefinitely.
>>> 
>>> Ryanne
>>> 
>>> 
>>> 
>>> 
>>> On Wed, Apr 17, 2019, 7:07 AM Sönke Liebau wrote:
>>> 
>>>> Hi everybody,
>>>>
>>>> Jason and I were recently discussing command line argument parsing on
>>>> KAFKA-8131 (or rather the related pull request) [1].
>>>>
>>>> Command line tools and their arguments are somewhat diverse at the moment.
>>>> Most of the tools use joptsimple for argument parsing, some newer Java
>>>> tools use argparse4j instead and some tools use nothing at all.
>>>> I've looked for a reason as to why there are two libraries being used, but
>>>> couldn't really find anything. Paolo brought up the same question on the
>>>> mailing list a while back [7], but got no response either.
>>>> Does anybody know why this is the case?
>>>>
>>>> This results in no central place to add universal parameters like help and
>>>> version, as well as the help output looking different between some of the
>>>> tools.
>>>> Also, there are a number of parameters that should be renamed to adhere to
>>>> defaults.
>>>>
>>>> There have been a few discussions and initiatives around this in the past.
>>>> Just off the top of my head (and a 5 minute jira search) there are:
>>>> - KIP-14 [2]
>>>> - KAFKA-2111 [3]
>>>> - KIP-316 [4]
>>>> - KAFKA-1292 [5]
>>>> - KAFKA-3530 [6]
>>>> - and probably many more
>>>>
>>>> Would people generally be in favor of revisiting this topic?
>>>>
>>>> What I'd propose to do is:
>>>> - comb through jira and KIPs, clean up old stuff and create a new umbrella
>>>> issue to track this (maybe reuse KIP-4 as well)
>>>> - agree on one library for parsing command line arguments (don't care which
>>>> one, but two is one too many I think)
>>>> - refactor tools to use one library and default way of argument parsing
>>>> with central help and version parameter
>>>> - add aliases for options that should be renamed according to KIP-4 (and
>>>> maybe others) so that both new and old work for a while, deprecate old
>>>> parameters for a cycle or two and then remove them
>>>>
>>>> I'll shut up now and see if people would consider this useful or have any
>>>> other input :)
>>>>
>>>> Best regards,
>>>> Sönke
>>>>
>>>> [1] https://github.com/apache/kafka/pull/6481#discussion_r273773003
>>>> [2] https://cwiki.apache.org/confluence/display/KAFKA/KIP-14+-+Tools+Standardization
>>>> [3] https://issues.apache.org/jira/browse/KAFKA-2111
>>>> [4] https://cwiki.apache.org/confluence/display/KAFKA/KIP-316%3A+Command-line+overrides+for+ConnectDistributed+worker+properties
>>>> [5] https://issues.apache.org/jira/browse/KAFKA-1292
>>>> [6] https://issues.apache.org/jira/browse/KAFKA-3530
>>>> [7] https://sematext.com/opensee/m/Kafka/uyzND10ObP01p77VS?subj=From+Scala+to+Java+based+tools+joptsimple+vs+argparse4j
>>>>
>>> 
>> 
>> 
>> --
>> -- Guozhang
>> 
> 
> 
> -- 
> Sönke Liebau
> Partner
> Tel. +49 179 7940878
> OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany
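
[Editor's sketch] The aliasing idea from this thread (accept both consumer.config and consumer-config, optionally warning on the old spelling) could look roughly like the following. This is a minimal illustration using Python's argparse, the module argparse4j is based on, not the joptsimple or argparse4j APIs themselves. The tool name, the DEPRECATED table and the AliasAction class are hypothetical names made up for this example.

```python
import argparse
import warnings

# Hypothetical table of old spellings and their preferred replacements.
DEPRECATED = {"--consumer.config": "--consumer-config"}


class AliasAction(argparse.Action):
    """Store the value as usual, warning if a deprecated spelling was used."""

    def __call__(self, parser, namespace, values, option_string=None):
        if option_string in DEPRECATED:
            warnings.warn(
                f"{option_string} is deprecated, use {DEPRECATED[option_string]}",
                DeprecationWarning,
            )
        setattr(namespace, self.dest, values)


def build_parser():
    """Build a parser with central --help/--version and one aliased option."""
    parser = argparse.ArgumentParser(prog="example-tool")
    # Both spellings feed the same destination, so the tool sees one setting.
    parser.add_argument(
        "--consumer-config",
        "--consumer.config",
        dest="consumer_config",
        action=AliasAction,
        help="path to a consumer properties file",
    )
    parser.add_argument("--version", action="version", version="example-tool 0.0")
    return parser
```

Keeping the aliases indefinitely, as Ryanne suggests, would amount to leaving the DEPRECATED table empty; removing an old spelling after a deprecation period would amount to dropping it from the add_argument call.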



Re: [VOTE] KIP-415: Incremental Cooperative Rebalancing in Kafka Connect

2019-03-06 Thread McCaig, Rhys
+1 (non-binding)

> On Mar 6, 2019, at 3:40 PM, Ryanne Dolan  wrote:
> 
> +1 (non-binding)
> 
> Thanks!
> Ryanne
> 
> On Wed, Mar 6, 2019, 4:28 PM Konstantine Karantasis <
> konstant...@confluent.io> wrote:
> 
>> I'd like to open the vote on KIP-415: Incremental Cooperative Rebalancing
>> in Kafka Connect
>> 
>> 
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-415%3A+Incremental+Cooperative+Rebalancing+in+Kafka+Connect
>> 
>> a proposal that will allow Kafka Connect to scale significantly the number
>> of connectors and tasks it can run in a cluster of Connect workers.
>> 
>> Thanks,
>> Konstantine
>> 



Re: Statestore restoration & scaling questions - possible KIP as well.

2019-02-07 Thread McCaig, Rhys
Adam,

I don’t have an answer for you but I would also be interested in clarification 
of this process if anyone can provide more details. If your reading is correct 
I would welcome the KIP to reduce the scaling pauses.

Cheers,
Rhys McCaig

> On Feb 6, 2019, at 7:44 AM, Adam Bellemare  wrote:
> 
> Bump - hoping someone has some insight. Alternately, redirection to a more
> suitable forum.
> 
> Thanks
> 
> On Sun, Feb 3, 2019 at 10:25 AM Adam Bellemare wrote:
> 
>> Hey Folks
>> 
>> I have a few questions around the operations of stateful processing while
>> scaling nodes up/down, and a possible KIP in question #4. Most of them have
>> to do with task processing during rebuilding of state stores after scaling
>> nodes up.
>> 
>> Scenario:
>> Single node/thread, processing 2 topics (10 partitions each):
>> User event topic (events) - ie: key:userId, value: ProductId
>> Product topic (entity) - ie: key: ProductId, value: productData
>> 
>> My topology looks like this:
>> 
>> KTable productTable = ... // materialize from product topic
>> 
>> KStream output = userStream
>>     .map(x => (x.value, x.key))  // Swap the key and value around
>>     .join(productTable, ...)     // Joiner is not relevant here
>>     .to(...)                     // Send it to some output topic
>> 
>> 
>> Here are my questions:
>> 1) If I scale the processing node count up, partitions will be rebalanced
>> to the new node. Does processing continue as normal on the original node,
>> while the new node's processing is paused as the internal state stores are
>> rebuilt/reloaded? From my reading of the code (and own experience) I
>> believe this to be the case, but I am just curious in case I missed
>> something.
>> 
>> 2) What happens to the userStream map task? Will the new node be able to
>> process this task while the state store is rebuilding/reloading? My reading
>> of the code suggests that this map process will be paused on the new node
>> while the state store is rebuilt. The effect of this is that it will lead
>> to a delay in events reaching the original node's partitions, which will be
>> seen as late-arriving events. Am I right in this assessment?
>> 
>> 3) How does scaling up work with standby state-store replicas? From my
>> reading of the code, it appears that scaling a node up will result in a
>> rebalance, with the state assigned to the new node being rebuilt first
>> (leading to a pause in processing). Following this, the standby replicas are
>> populated. Am I correct in this reading?
>> 
>> 4) If my reading in #3 is correct, would it be possible to pre-populate
>> the standby stores on scale-up before initiating active-task transfer? This
>> would allow seamless scale-up and scale-down without requiring any pauses
>> for rebuilding state. I am interested in kicking this off as a KIP if so,
>> but would appreciate any JIRAs or related KIPs to read up on prior to
>> digging into this.
>> 
>> 
>> Thanks
>> 
>> Adam Bellemare
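
[Editor's sketch] For readers following the topology in Adam's message, the data flow can be modelled in a few lines of plain Python. This is only an illustration of the re-key-then-join semantics, not Kafka Streams code: the function names and sample records are hypothetical, and the join is modelled as an inner stream-table lookup, which is how KStream.join against a KTable behaves.

```python
def swap_key_value(events):
    """Re-key each (userId, productId) event by productId."""
    return [(product_id, user_id) for user_id, product_id in events]


def join_with_table(stream, table, joiner):
    """Inner stream-table join: emit only records whose key exists in the table."""
    return [(key, joiner(value, table[key])) for key, value in stream if key in table]


# Hypothetical sample data standing in for the two topics.
user_events = [("alice", "p1"), ("bob", "p2"), ("carol", "p9")]
product_table = {"p1": "kettle", "p2": "toaster"}

joined = join_with_table(
    swap_key_value(user_events),
    product_table,
    lambda user_id, product: f"{user_id} viewed {product}",
)
```

Because the map step changes the record key, Kafka Streams has to repartition the stream through an internal topic before the join so that each event lands on the partition that holds the matching product. That repartitioning hop is why question 2 matters: a paused map task on the new node delays events destined for join partitions owned by other nodes.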
>> 



Re: [EXTERNAL] [VOTE] KIP-382 MirrorMaker 2.0

2019-01-01 Thread McCaig, Rhys
+1 (non-binding). Fantastic work on the KIP Ryanne.

> On Dec 25, 2018, at 9:10 AM, Stephane Maarek wrote:
> 
> +1 ! Great stuff
> 
> Stephane
> 
> On Mon., 24 Dec. 2018, 12:07 pm Edoardo Comar wrote:
>> +1 non-binding
>> 
>> thanks for the KIP
>> --
>> 
>> Edoardo Comar
>> 
>> IBM Event Streams
>> 
>> 
>> Harsha  wrote on 21/12/2018 20:17:03:
>> 
>>> From: Harsha 
>>> To: dev@kafka.apache.org
>>> Date: 21/12/2018 20:17
>>> Subject: Re: [VOTE] KIP-382 MirrorMaker 2.0
>>> 
>>> +1 (binding).  Nice work Ryan.
>>> -Harsha
>>> 
>>> On Fri, Dec 21, 2018, at 8:14 AM, Andrew Schofield wrote:
 +1 (non-binding)
 
 Andrew Schofield
 IBM Event Streams
 
 On 21/12/2018, 01:23, "Srinivas Reddy" wrote:
 
+1 (non binding)
 
Thank you Ryan for the KIP, let me know if you need support in implementing
it.
 
-
Srinivas
 
- Typed on tiny keys. pls ignore typos.{mobile app}
 
 
On Fri, 21 Dec, 2018, 08:26 Ryanne Dolan wrote:
 
> Thanks for the votes so far!
> 
> Due to recent discussions, I've removed the high-level REST API from the
> KIP.
> 
> On Thu, Dec 20, 2018 at 12:42 PM Paul Davidson wrote:
> 
>> +1
>> 
>> Would be great to see the community build on the basic approach we took
>> with Mirus. Thanks Ryanne.
>> 
>> On Thu, Dec 20, 2018 at 9:01 AM Andrew Psaltis wrote:
>> 
>>> +1
>>> 
>>> Really looking forward to this and to helping in any way I can. Thanks for
>>> kicking this off Ryanne.
>>> 
>>> On Thu, Dec 20, 2018 at 10:18 PM Andrew Otto wrote:
>>> 
 +1
 
 This looks like a huge project! Wikimedia would be very excited to have
 this. Thanks!
 
 On Thu, Dec 20, 2018 at 9:52 AM Ryanne Dolan wrote:
 
> Hey y'all, please vote to adopt KIP-382 by replying +1 to this thread.
> 
> For your reference, here are the highlights of the proposal:
> 
> - Leverages the Kafka Connect framework and ecosystem.
> - Includes both source and sink connectors.
> - Includes a high-level driver that manages connectors in a dedicated
> cluster.
> - High-level REST API abstracts over connectors between multiple Kafka
> clusters.
> - Detects new topics, partitions.
> - Automatically syncs topic configuration between clusters.
> - Manages downstream topic ACL.
> - Supports "active/active" cluster pairs, as well as any number of active
> clusters.
> - Supports cross-data center replication, aggregation, and other complex
> topologies.
> - Provides new metrics including end-to-end replication latency across
> multiple data centers/clusters.
> - Emits offsets required to migrate consumers between clusters.
> - Tooling for offset translation.
> - MirrorMaker-compatible legacy mode.
> 
> Thanks, and happy holidays!
> Ryanne
> 
 
>>> 
>> 
>> 
>> --
>> Paul Davidson
>> Principal Engineer, Ajna Team
>> Big Data & Monitoring
>> 
> 
 
 
>>> 
>> 
>> Unless stated otherwise above:
>> IBM United Kingdom Limited - Registered in England and Wales with number
>> 741598.
>> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>> 



Re: [DISCUSS] KIP-310: Add a Kafka Source Connector to Kafka Connect

2018-10-25 Thread McCaig, Rhys
Hi All,

Based on the feedback in this thread, and in light of Ryanne’s excellent 
proposal (KIP-382: MirrorMaker 2.0) which incorporates and extends the goals of 
KIP-310, I have updated the status of KIP-310 to “Discarded” and added a 
comment that KIP-382 supersedes it.

Thank you all for the discussion and feedback - this is my first KIP and I 
appreciate the community providing feedback on my contributions!

Rhys

> On Sep 26, 2018, at 10:42 AM, Konstantine Karantasis wrote:
> 
> Hi Rhys,
> 
> thanks for the proposal and apologies for the late feedback. Utilizing
> Connect to mirror Kafka topics is definitely a plausible proposal for a
> very useful use case.
> 
> However, I don't think the apache/kafka repository is the right place to
> host such a Connector. Currently, no full-featured, production-ready
> connectors are hosted in AK. The only two connectors shipped with AK
> (FileStreamSourceConnector and FileStreamSinkConnector) are there to
> demonstrate implementations only as examples.
> 
> I find this approach very appealing. AK focuses on providing the core
> infrastructure for Connect, that is required in every Kafka Connect
> deployment, as well as offering the means to generically install, deploy
> and operate connectors. But all the connectors reside outside AK and
> comprise a vibrant ecosystem of open source and proprietary components
> that, essentially - even for the most useful and ubiquitous of the
> connectors - are optional for users to install and use. This seems simple
> and flexible, both in terms of releasing and using/deploying software
> related to Kafka Connect. I might even say that I'd be in favor of
> extending this approach to all the Connect components, including
> Transformations and Converters.
> 
> I'm aware that MirrorMaker is part of AK, but to me this refers to the
> early days of Apache Kafka, when the size of the project and the ecosystem
> was smaller, Connect and Streams had not been implemented yet, and
> mirroring topics between Kafka clusters was already a basic need. With a
> much more rich ecosystem now and more sizable and well defined packages in
> AK, I think the approach that decouples connectors from the Connect
> framework itself is a good one.
> 
> In my opinion, the fact that this connector targets Kafka itself as a
> source is not an adequate reason to include it in apache/kafka within the
> Connect framework. It seems it can evolve naturally, as every other
> connector, in its own repository.
> 
> Regards,
> Konstantine
> 
> 
> On Sat, Aug 4, 2018 at 7:20 PM McCaig, Rhys  wrote:
> 
>> Hi All,
>> 
>> If there are no further comments on this KIP I’ll start a vote early this
>> week.
>> 
>> Rhys
>> 
>> On Aug 1, 2018, at 12:32 AM, McCaig, Rhys wrote:
>> 
>> Hi All,
>> 
>> I’ve updated the proposal to include the improvements suggested by
>> Stephane.
>> 
>> I have also submitted a PR to implement this functionality into Kafka.
>> https://github.com/apache/kafka/pull/5438
>> 
>> I don’t have a benchmark against MirrorMaker yet, as I only currently have
>> a local docker stack available to me, though I have seen very good
>> performance in that test stack (200k messages/sec@100bytes on limited
>> compute resource containers). Further benchmarking might take a few days.
>> 
>> Review and comments would be appreciated.
>> 
>> Cheers,
>> Rhys
>> 
>> 
>> On Jun 18, 2018, at 9:00 AM, McCaig, Rhys wrote:
>> 
>> Hi Stephane,
>> 
>> Thanks for your feedback and apologies for the delay in my response.
>> 
>> Are there any performance benchmarks against Mirror Maker available? I'm
>> interested to know if this is more performant / scalable.
>> Regarding the implementation, here's some feedback:
>> 
>> 
>> Currently I don’t have any performance benchmarks, but I think this is a
>> great idea, I’ll see if I can set something up in the next week or so.
>> 
>> - I think it's worth mentioning that this solution does not rely on
>> consumer groups, and therefore tracking progress may be tricky. Can you
>> think of a way to expose that?
>> 
>> This is a reasonable concern. I’m not sure how to track this other than
>> looking at the Kafka Connect offsets. Once a message is passed to the
>> framework, I'm unaware of a way to get at the commit offsets on the
>> producer side. Any thoughts?
>> 
>> - Some code can be in config Validator I believe:
>> 
>> https://github.com/Comcast/MirrorTool-

Re: [DISCUSS] KIP-382: MirrorMaker 2.0

2018-10-16 Thread McCaig, Rhys

> In your example, us-west.us-east.us-central.us-west.topic is an invalid
> "remote topic" name because us-west appears twice. MM2 will not replicate
> us-east.us-central.us-west.topic into us-west a second time, because the
> source topic already has us-west in the prefix. This is what I mean by
> "cycle detection" -- cyclical replication does not result in infinite
> recursion.

Oh - got it, it checks the entire prefix, which seems obvious to me in 
retrospect :)

Rhys


> On Oct 15, 2018, at 3:18 PM, Ryanne Dolan  wrote:
> 
> Rhys, thanks for your enthusiasm!
> 
> In your example, us-west.us-east.us-central.us-west.topic is an invalid
> "remote topic" name because us-west appears twice. MM2 will not replicate
> us-east.us-central.us-west.topic into us-west a second time, because the
> source topic already has us-west in the prefix. This is what I mean by
> "cycle detection" -- cyclical replication does not result in infinite
> recursion.
> 
> It's important to note that MM2 does NOT disallow these sort of cycles, it
> just knows how to deal with them properly.
> 
> Also notice this is done at the topic level, not per record. The records
> don't need any special header or anything for this cycle detection
> mechanism to work.
> 
> Thanks!
> Ryanne
> 
> On Mon, Oct 15, 2018 at 3:40 PM McCaig, Rhys wrote:
> 
>> Hi Ryanne,
>> 
>> This KIP is fantastic. It provides a great vision for how MirrorMaker
>> should evolve in the Kafka project.
>> 
>> I have a question on cycle detection - In a scenario where I have 3
>> clusters replicating between each other, it seems it may be easy to
>> misconfigure the connectors if auto topic creation is turned on so that
>> records become replicated to increasingly longer topic names (until the
>> topic name limit is reached). Consider clusters us-west, us-central,
>> us-east:
>> 
>> us-west: topic
>> us-central: us-west.topic
>> us-east: us-central.us-west.topic
>> us-west: us-east.us-central.us-west.topic
>> us-central: us-west.us-east.us-central.us-west.topic
>> 
>> I’m not sure whether this scenario would actually justify implementing
>> additional measures to avoid such a configuration, rather than ensuring
>> that the documentation is clear on how to avoid such scenarios - would be
>> good to hear what others think on this.
>> 
>> Excited to see the discussion on this one.
>> 
>> Rhys
>> 
>>> On Oct 15, 2018, at 9:16 AM, Ryanne Dolan  wrote:
>>> 
>>> Hey y'all!
>>> 
>>> Please take a look at KIP-382:
>>> 
>>> 
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0
>>> 
>>> Thanks for your feedback and support.
>>> 
>>> Ryanne
>> 
>> 
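
[Editor's sketch] The prefix check Ryanne describes can be illustrated with a few lines of Python. This is not MM2's implementation, only a model of the rule stated in the thread: a topic is not replicated into a cluster whose alias already appears in the topic's prefix. The function names are hypothetical, and the sketch assumes that neither cluster aliases nor base topic names contain dots.

```python
def remote_topic_name(source_cluster, topic):
    """Name a replicated topic by prefixing the alias of the cluster it came from."""
    return f"{source_cluster}.{topic}"


def should_replicate(topic, target_cluster):
    """Return False if the target's alias is already one of the prefix components."""
    # Everything before the last dot-separated component is treated as the
    # chain of cluster aliases the topic has already passed through.
    visited = topic.split(".")[:-1]
    return target_cluster not in visited
```

With the clusters from the thread, us-west.topic on us-central may still be replicated into us-east, but us-central.us-west.topic on us-east is not replicated into us-west, because us-west is already in the prefix. The check works on topic names only, which matches the point that no per-record header is needed.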



Re: [DISCUSS] KIP-382: MirrorMaker 2.0

2018-10-15 Thread McCaig, Rhys
Hi Ryanne,

This KIP is fantastic. It provides a great vision for how MirrorMaker should 
evolve in the Kafka project.

I have a question on cycle detection - In a scenario where I have 3 clusters 
replicating between each other, it seems it may be easy to misconfigure the 
connectors if auto topic creation is turned on so that records become 
replicated to increasingly longer topic names (until the topic name limit is 
reached). Consider clusters us-west, us-central, us-east:

us-west: topic
us-central: us-west.topic
us-east: us-central.us-west.topic
us-west: us-east.us-central.us-west.topic
us-central: us-west.us-east.us-central.us-west.topic

I’m not sure whether this scenario would actually justify implementing 
additional measures to avoid such a configuration, rather than ensuring that 
the documentation is clear on how to avoid such scenarios - would be good to 
hear what others think on this.

Excited to see the discussion on this one.

Rhys

> On Oct 15, 2018, at 9:16 AM, Ryanne Dolan  wrote:
> 
> Hey y'all!
> 
> Please take a look at KIP-382:
> 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0
> 
> Thanks for your feedback and support.
> 
> Ryanne
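
[Editor's sketch] The concern raised in this message, topic names growing on every hop around a ring of clusters, can be made concrete with a small Python model. It assumes naive replication with no cycle check at all, which is the misconfiguration being asked about rather than how MM2 is described as behaving. The function names are hypothetical; the 249-character figure is Kafka's maximum topic name length.

```python
MAX_TOPIC_NAME_LENGTH = 249  # Kafka's limit on topic name length


def replicate_around_ring(topic, clusters, hops):
    """Return (cluster, topic name) pairs for naive replication around a ring."""
    names = [(clusters[0], topic)]
    name = topic
    for i in range(hops):
        source = clusters[i % len(clusters)]
        target = clusters[(i + 1) % len(clusters)]
        name = f"{source}.{name}"  # each hop prefixes the source alias
        names.append((target, name))
    return names


def hops_until_limit(topic, clusters, limit=MAX_TOPIC_NAME_LENGTH):
    """Count how many hops succeed before the next name would exceed the limit."""
    name, hops = topic, 0
    while True:
        candidate = f"{clusters[hops % len(clusters)]}.{name}"
        if len(candidate) > limit:
            return hops
        name, hops = candidate, hops + 1
```

Calling replicate_around_ring("topic", ["us-west", "us-central", "us-east"], 4) reproduces the five names listed above. Each full trip around this particular ring adds 27 characters, so unchecked growth would stop only when the name limit is reached.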



Re: [EXTERNAL] Incremental Cooperative Rebalancing

2018-10-04 Thread McCaig, Rhys
This is fantastic. I’m really excited to see the work on this.

> On Oct 2, 2018, at 4:22 PM, Konstantine Karantasis wrote:
> 
> Hey everyone,
> 
> I'd like to bring to your attention a general design document that was just
> published in Apache Kafka's wiki space:
> 
> https://cwiki.apache.org/confluence/display/KAFKA/Incremental+Cooperative+Rebalancing%3A+Support+and+Policies
> 
> It deals with the subject of Rebalancing of groups in Kafka and proposes
> basic infrastructure to support improvements on the current rebalancing
> protocol as well as a set of policies that can be implemented to optimize
> rebalancing under a number of real-world scenarios.
> 
> Currently, this wiki page is meant to serve as a reference to the
> proposition of Incremental Cooperative Rebalancing overall. Specific KIPs
> will follow in order to describe in more detail - using the standard KIP
> format - the basic infrastructure and the first policies that will be
> proposed for implementation in components such as Connect, the Kafka
> Consumer and Streams.
> 
> Stay tuned!
> Konstantine



Re: [EXTERNAL] Apache Kafka project charter

2018-10-01 Thread McCaig, Rhys
ent (compatibility across versions of AK broker  & Connect, as well
> as upgrade & API & config compatibility for the connector itself.
> 
> -Ewen
> 
> On Sun, Sep 30, 2018 at 1:44 PM Matthias J. Sax wrote:
> 
>> I am not aware of anything like this. And I also think, it's difficult
>> to generalize. So far, each feature is discussed on a per-case basis.
>> 
>> Because it's hard to draw the line, we might be too restrictive
>> or too loose in a "project charter", thus scaring people from starting
>> KIPs, which would be bad for the community and the project IMHO.
>> 
>> I also think that the overhead of writing a KIP is not too large, and
>> thus the risk (and "wasted time") that a KIP is rejected because "not
>> part of the project" is rather small. Also, anybody could suggest a
>> feature and collect feedback on the mailing list even before a concrete
>> KIP is proposed.
>> 
>> Just my 2 cents.
>> 
>> 
>> -Matthias
>> 
>> 
>> 
>> On 9/29/18 4:31 AM, Jakub Scholz wrote:
>>> Hi community,
>>> 
>>> I noticed the following argument in the discussion about KIP-310.
>>> 
>>>> However, I don't think the apache/kafka repository is the right place to
>>>> host such a Connector.
>>> 
>>> I was wondering whether there is some project charter describing what does
>>> and what does not belong to the Apache Kafka project. I tried to search for
>>> it, but I haven't found anything.
>>> 
>>> If nothing like that exists, I wonder if we should write something. I think
>>> it's not very community friendly to let people write the KIP just to get
>>> feedback like this. By that I do not mean that the point raised by
>>> Konstantine is necessarily wrong. All I'm trying to say is that I think
>>> there should be some project charter which would describe what does and
>>> doesn't belong in Apache Kafka to make it clear to everyone before
>>> someone starts writing a KIP.
>>> 
>>> WDYT? Does something like that already exist?
>>> 
>>> Thanks & Regards
>>> Jakub
>>> 
>>> On Wed, Sep 26, 2018 at 7:43 PM Konstantine Karantasis <
>>> konstant...@confluent.io> wrote:
>>> 
>>>> Hi Rhys,
>>>> 
>>>> thanks for the proposal and apologies for the late feedback. Utilizing
>>>> Connect to mirror Kafka topics is definitely a plausible proposal for a
>>>> very useful use case.
>>>> 
>>>> However, I don't think the apache/kafka repository is the right place to
>>>> host such a Connector. Currently, no full-featured, production-ready
>>>> connectors are hosted in AK. The only two connectors shipped with AK
>>>> (FileStreamSourceConnector and FileStreamSinkConnector) are there to
>>>> demonstrate implementations only as examples.
>>>> 
>>>> I find this approach very appealing. AK focuses on providing the core
>>>> infrastructure for Connect, that is required in every Kafka Connect
>>>> deployment, as well as offering the means to generically install, deploy
>>>> and operate connectors. But all the connectors reside outside AK and
>>>> comprise a vibrant ecosystem of open source and proprietary components
>>>> that, essentially - even for the most useful and ubiquitous of the
>>>> connectors - are optional for users to install and use. This seems simple
>>>> and flexible, both in terms of releasing and using/deploying software
>>>> related to Kafka Connect. I might even say that I'd be in favor of
>>>> extending this approach to all the Connect components, including
>>>> Transformations and Converters.
>>>> 
>>>> I'm aware that MirrorMaker is part of AK, but to me this refers to the
>>>> early days of Apache Kafka, when the size of the project and the ecosystem
>>>> was smaller, Connect and Streams had not been implemented yet, and
>>>> mirroring topics between Kafka clusters was already a basic need. With a
>>>> much more rich ecosystem now and more sizable and well defined packages in
>>>> AK, I think the approach that decouples connectors from the Connect
>>>> framework itself is a good one.
>>>> 
>>>> In my opinion, the fact that this connector targets Kafka itself as a
>>>> source is not an adequate reason to include it in apache/kafka with

Re: [EXTERNAL] [DISCUSS] KIP-310: Add a Kafka Source Connector to Kafka Connect

2018-09-26 Thread McCaig, Rhys
Hi Konstantine,

Thank you for your thoughtful comments!

> However, I don't think the apache/kafka repository is the right place to
> host such a Connector. 


> I find this approach very appealing. AK focuses on providing the core
> infrastructure for Connect, that is required in every Kafka Connect
> deployment, as well as offering the means to generically install, deploy
> and operate connectors.

I personally flip-flopped on this, with similar thoughts, when I initially 
considered raising a KIP for this functionality. 

When I initially developed a Kafka source connector, this was out of necessity 
- MirrorMaker requires zkconnect strings, which I didn't have access to for the 
source cluster, and Confluent’s proprietary connector also required zk 
connections - though it has now been updated to remove this limitation. 

While I understand the point of view that MirrorMaker refers to the early days 
of Apache Kafka, it has become a critical tool for replicating data across 
Kafka clusters for a large portion of the community who are managing Kafka 
at scale. As such, I suspect that there is a lot of interest in the Kafka 
project supporting topic replication across clusters. While one approach (which 
I don’t have the knowledge or time to address) could be to include it as a core 
component of Kafka itself (such as Apache Pulsar’s global topics), my view is 
that at this point in time, Kafka Connect is considered *the* way to ship data 
in and out of a specific Kafka cluster, regardless of the external system. 

I’d welcome further discussion on whether the community thinks this is the right 
approach for the Kafka project to take, in regards to handling Kafka topic 
mirroring. I *think* that it’s important and common enough that there should be 
support in the project - and MirrorMaker is, as you mention, showing its age. 

Cheers,
Rhys




> On Sep 26, 2018, at 10:42 AM, Konstantine Karantasis wrote:
> 
> Hi Rhys,
> 
> thanks for the proposal and apologies for the late feedback. Utilizing
> Connect to mirror Kafka topics is definitely a plausible proposal for a
> very useful use case.
> 
> However, I don't think the apache/kafka repository is the right place to
> host such a Connector. Currently, no full-featured, production-ready
> connectors are hosted in AK. The only two connectors shipped with AK
> (FileStreamSourceConnector and FileStreamSinkConnector) are there to
> demonstrate implementations only as examples.
> 
> I find this approach very appealing. AK focuses on providing the core
> infrastructure for Connect, that is required in every Kafka Connect
> deployment, as well as offering the means to generically install, deploy
> and operate connectors. But all the connectors reside outside AK and
> comprise a vibrant ecosystem of open source and proprietary components
> that, essentially - even for the most useful and ubiquitous of the
> connectors - are optional for users to install and use. This seems simple
> and flexible, both in terms of releasing and using/deploying software
> related to Kafka Connect. I might even say that I'd be in favor of
> extending this approach to all the Connect components, including
> Transformations and Converters.
> 
> I'm aware that MirrorMaker is part of AK, but to me this refers to the
> early days of Apache Kafka, when the size of the project and the ecosystem
> was smaller, Connect and Streams had not been implemented yet, and
> mirroring topics between Kafka clusters was already a basic need. With a
> much more rich ecosystem now and more sizable and well defined packages in
> AK, I think the approach that decouples connectors from the Connect
> framework itself is a good one.
> 
> In my opinion, the fact that this connector targets Kafka itself as a
> source is not an adequate reason to include it in apache/kafka within the
> Connect framework. It seems it can evolve naturally, as every other
> connector, in its own repository.
> 
> Regards,
> Konstantine
> 
> 
> On Sat, Aug 4, 2018 at 7:20 PM McCaig, Rhys  wrote:
> 
>> Hi All,
>> 
>> If there are no further comments on this KIP I’ll start a vote early this
>> week.
>> 
>> Rhys
>> 
>> On Aug 1, 2018, at 12:32 AM, McCaig, Rhys wrote:
>> 
>> Hi All,
>> 
>> I’ve updated the proposal to include the improvements suggested by
>> Stephane.
>> 
>> I have also submitted a PR to implement this functionality into Kafka.
>> https://github.com/apache/kafka/pull/5438
>> 
>> I don’t have a benchmark against MirrorMaker yet, as I only currently have
>> a local docker stack available to me, though I have seen very good
>> performance in that test stack (200k messages/sec@1

Re: [VOTE] KIP-310: Add a Kafka Source Connector to Kafka Connect

2018-09-11 Thread McCaig, Rhys
Hi All,

Bumping this again. 
Can I get feedback from some binding vote holders on this KIP? I think it’s a 
fairly straightforward KIP and a worthwhile addition to Kafka Connect.

Cheers,
Rhys



> On Sep 4, 2018, at 12:32 PM, McCaig, Rhys  wrote:
> 
> Bumping this thread. 
> 
>> On Aug 14, 2018, at 11:00 AM, McCaig, Rhys  wrote:
>> 
>> Bumping this thread. Looking for some binding votes or further request for 
>> discussion.
>> 
>>> On Aug 10, 2018, at 12:38 PM, McCaig, Rhys  wrote:
>>> 
>>> Thanks Stephane!
>>> 
>>> If there is a desire for further discussion I am certainly open to 
>>> reverting this to a discussion thread. For now I’ll keep this vote open 
>>> until we get either 3 binding votes or further request for discussion from 
>>> the community.
>>> 
>>> Do you have any additional thoughts on the KIP you’d like to add?
>>> 
>>> Cheers,
>>> Rhys
>>> 
>>>> On Aug 10, 2018, at 2:14 AM, Stephane Maarek wrote:
>>>> 
>>>> Hi Rhys,
>>>> 
>>>> Overall I'm +1 (non binding), but you're going to need 3 binding votes for
>>>> this KIP to pass.
>>>> I don't feel there has been enough discussion on this from the community.
>>>> Can we get some input from other people?
>>>> 
>>>> Thanks for starting the vote nonetheless :)
>>>> Stephane
>>>> 
>>>> On 8 August 2018 at 20:28, McCaig, Rhys  wrote:
>>>> 
>>>>> Hi
>>>>> 
>>>>> Could we get a couple of votes on this KIP - voting closes in 24 hours.
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> Rhys
>>>>> 
>>>>>> On Aug 6, 2018, at 11:51 AM, McCaig, Rhys wrote:
>>>>>> 
>>>>>> Hi All,
>>>>>> 
>>>>>> I’m starting a vote on KIP-310: Add a Kafka Source Connector to Kafka
>>>>> Connect
>>>>>> 
>>>>>> KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-310%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect
>>>>>> Discussion Thread: http://mail-archives.apache.org/mod_mbox/kafka-dev/201808.mbox/%3c17E8D696-E51C-4BEB-bd70-9324d4b53...@comcast.com%3e
>>>>>> 
>>>>>> Cheers,
>>>>>> Rhys
>>>>> 
>>>>> 
>>> 
>> 
> 



Re: [EXTERNAL] [VOTE] KIP-158: Kafka Connect should allow source connectors to set topic-specific settings for new topics

2018-09-11 Thread McCaig, Rhys
Looks great Randall 
+1 (non-binding)

> On Sep 9, 2018, at 7:17 PM, Gwen Shapira  wrote:
> 
> +1
> Useful improvement, thanks Randall.
> 
> 
> On Fri, Sep 7, 2018, 3:28 PM Randall Hauch  wrote:
> 
>> I believe the feedback on KIP-158 has been addressed. I'd like to start a
>> vote.
>> 
>> KIP:
>> 
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-158%3A+Kafka+Connect+should+allow+source+connectors+to+set+topic-specific+settings+for+new+topics
>> 
>> Discussion Thread:
>> https://www.mail-archive.com/dev@kafka.apache.org/msg73775.html
>> 
>> Thanks!
>> 
>> Randall
>> 



Re: [EXTERNAL] [VOTE] KIP-310: Add a Kafka Source Connector to Kafka Connect

2018-09-04 Thread McCaig, Rhys
Bumping this thread. 

> On Aug 14, 2018, at 11:00 AM, McCaig, Rhys  wrote:
> 
> Bumping this thread. Looking for some binding votes or further request for 
> discussion.
> 
>> On Aug 10, 2018, at 12:38 PM, McCaig, Rhys  wrote:
>> 
>> Thanks Stephane!
>> 
>> If there is a desire for further discussion I am certainly open to reverting 
>> this to a discussion thread. For now I’ll keep this vote open until we get 
>> either 3 binding votes or further request for discussion from the community.
>> 
>> Do you have any additional thoughts on the KIP you’d like to add?
>> 
>> Cheers,
>> Rhys
>> 
>>> On Aug 10, 2018, at 2:14 AM, Stephane Maarek 
>>>  wrote:
>>> 
>>> Hi Rhys,
>>> 
>>> Overall I'm +1 (non binding), but you're going to need 3 binding votes for
>>> this KIP to pass.
>>> I don't feel there has been enough discussion on this from the community.
>>> Can we get some input from other people?
>>> 
>>> Thanks for starting the vote nonetheless :)
>>> Stephane
>>> 
>>> On 8 August 2018 at 20:28, McCaig, Rhys  wrote:
>>> 
>>>> Hi
>>>> 
>>>> Could we get a couple of votes on this KIP - voting closes in 24 hours.
>>>> 
>>>> Thanks,
>>>> 
>>>> Rhys
>>>> 
>>>>> On Aug 6, 2018, at 11:51 AM, McCaig, Rhys 
>>>> wrote:
>>>>> 
>>>>> Hi All,
>>>>> 
>>>>> I’m starting a vote on KIP-310: Add a Kafka Source Connector to Kafka
>>>> Connect
>>>>> 
>>>>> KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-310%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect
>>>>> Discussion Thread: http://mail-archives.apache.org/mod_mbox/kafka-dev/201808.mbox/%3c17E8D696-E51C-4BEB-bd70-9324d4b53...@comcast.com%3e
>>>>> 
>>>>> Cheers,
>>>>> Rhys
>>>> 
>>>> 
>> 
> 



Re: [EXTERNAL] [DISCUSS] KIP-158: Kafka Connect should allow source connectors to set topic-specific settings for new topics

2018-08-27 Thread McCaig, Rhys
Randall,

This KIP looks great to me. As for _updating_ topic configs - it’s a nice-to-have, 
but certainly something that I could live without in order to get this KIP 
implemented. (It’s not something I would use in my current setup, but I can see 
some cases where it could be part of the workflow for mirrored topics.)
If it were to be included, I’d be happier to see it hidden behind a config flag 
(if the topic already exists, there could be an option to WARN/FAIL or change the 
topic, where the default would be WARN?)
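
A rough sketch of how such a flag might behave, purely as an illustration: the 
`ExistingTopicPolicy` enum, its `UPDATE` value and the `decide` helper are 
hypothetical names and are not part of the KIP or of Kafka Connect. It only 
shows the decision logic, with WARN assumed as the default.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;

public class ExistingTopicPolicyDemo {

    /** Hypothetical values for the suggested flag; WARN is the assumed default. */
    enum ExistingTopicPolicy { WARN, FAIL, UPDATE }

    /** What a connector would do for a topic that already exists. */
    enum Action { NONE, LOG_WARNING, THROW_ERROR, APPLY_CHANGES }

    /** Parses the flag value, falling back to WARN when it is unset or blank. */
    static ExistingTopicPolicy parse(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return ExistingTopicPolicy.WARN;
        }
        return ExistingTopicPolicy.valueOf(raw.trim().toUpperCase(Locale.ROOT));
    }

    /** Compares the desired settings with the existing ones and picks an action. */
    static Action decide(ExistingTopicPolicy policy,
                         Map<String, String> desired,
                         Map<String, String> actual) {
        boolean matches = desired.entrySet().stream()
                .allMatch(e -> e.getValue().equals(actual.get(e.getKey())));
        if (matches) {
            return Action.NONE; // nothing to warn about or change
        }
        switch (policy) {
            case FAIL:
                return Action.THROW_ERROR;
            case UPDATE:
                return Action.APPLY_CHANGES;
            default:
                return Action.LOG_WARNING;
        }
    }

    public static void main(String[] args) {
        Map<String, String> desired = Collections.singletonMap("cleanup.policy", "compact");
        Map<String, String> actual = Collections.singletonMap("cleanup.policy", "delete");
        System.out.println(decide(parse(null), desired, actual));
        System.out.println(decide(parse("fail"), desired, actual));
    }
}
```

Keeping the decision separate from the code that talks to the brokers would 
make the default easy to test without a cluster.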

Cheers,
Rhys

> On Aug 21, 2018, at 10:58 PM, Randall Hauch  wrote:
> 
> Okay, after much delay let's try this again for AK 2.1. Has anyone found
> any concerns? Stephane suggested that we allow updating topic
> configurations (everything but partition count). I'm unconvinced that it's
> worth the additional complexity in the implementation and the documentation
> to explain the behavior. Changing several of the topic-specific
> configurations have significant impact on broker behavior / functionality,
> so IMO we need to proceed more cautiously.
> 
> Stephane, do you have a particular use case in mind for updating topic
> configurations on an existing topic?
> 
> Randall
> 
> 
> On Fri, Jan 26, 2018 at 4:20 PM Randall Hauch  wrote:
> 
>> The KIP deadline for 1.1 has already passed, but I'd like to restart this
>> discussion so that we make the next release. I've not yet addressed the
>> previous comment about *existing* topics, but I'll try to do that over the
>> next few weeks. Any other comments/suggestions/questions?
>> 
>> Best regards,
>> 
>> Randall
>> 
>> On Thu, Oct 5, 2017 at 12:13 AM, Randall Hauch  wrote:
>> 
>>> Oops. Yes, I meant “replication factor”.
>>> 
 On Oct 4, 2017, at 7:18 PM, Ted Yu  wrote:
 
 Randall:
 bq. AdminClient currently allows changing the replication factory.
 
 By 'replication factory' did you mean 'replication factor' ?
 
 Cheers
 
> On Wed, Oct 4, 2017 at 9:58 AM, Randall Hauch 
>>> wrote:
> 
> Currently the KIP's scope is only topics that don't yet exist, and we
>>> have
> to cognizant of race conditions between tasks with the same connector.
>>> I
> think it is worthwhile to consider whether the KIP's scope should
>>> expand to
> also address *existing* partitions, though it may not be appropriate to
> have as much control when changing the topic settings for an existing
> topic. For example, changing the number of partitions (which the KIP
> considers a "topic-specific setting" even though technically it is not)
> shouldn't be done blindly due to the partitioning impacts, and IIRC you
> can't reduce them (which we could verify before applying). Also, I
>>> don't
> think the AdminClient currently allows changing the replication
>>> factory. I
> think changing the topic configs is less problematic both from what
>>> makes
> sense for connectors to verify/change and from what the AdminClient
> supports.
> 
> Even if we decide that it's not appropriate to change the settings on
>>> an
> existing topic, I do think it's advantageous to at least notify the
> connector (or task) prior to the first record sent to a given topic so
>>> that
> the connector can fail or issue a warning if it doesn't meet its
> requirements.
> 
> Best regards,
> 
> Randall
> 
> On Wed, Oct 4, 2017 at 12:52 AM, Stephane Maarek <
> steph...@simplemachines.com.au> wrote:
> 
>> Hi Randall,
>> 
>> Thanks for the KIP. I like it
>> What happens when the target topic is already created but the configs
>>> do
>> not match?
>> i.e. wrong RF, num partitions, or missing / additional configs? Will
>>> you
>> attempt to apply the necessary changes or throw an error?
>> 
>> Thanks!
>> Stephane
>> 
>> 
>> On 24/5/17, 5:59 am, "Mathieu Fenniak" >>> 
>> wrote:
>> 
>>   Ah, yes, I see you a highlighted part that should've made this
>>> clear
>>   to me the first read. :-)  Much clearer now!
>> 
>>   By the way, enjoyed your Debezium talk in NYC.
>> 
>>   Looking forward to this Kafka Connect change; it will allow me to
>>   remove a post-deployment tool that I hacked together for the
>>> purpose
>>   of ensuring auto-created topics have the right config.
>> 
>>   Mathieu
>> 
>> 
>>   On Tue, May 23, 2017 at 11:38 AM, Randall Hauch 
>> wrote:
>>> Thanks for the quick feedback, Mathieu. Yes, the first
> configuration
>> rule
>>> whose regex matches will be applied, and no other rules will be
>> used. I've
>>> updated the KIP to try to make this more clear, but let me know if
>> it's
>>> still not clear.
>>> 
>>> Best regards,
>>> 
>>> Randall
>>> 
>>> On Tue, May 23, 2017 at 10:07 AM, Mathieu Fenniak <
>>> mathieu.fenn...@replicon.com> wrote:
>>> 
 Hi Randall,
 

Re: [EXTERNAL] [VOTE] KIP-310: Add a Kafka Source Connector to Kafka Connect

2018-08-14 Thread McCaig, Rhys
Bumping this thread. Looking for some binding votes or further request for 
discussion.

> On Aug 10, 2018, at 12:38 PM, McCaig, Rhys  wrote:
> 
> Thanks Stephane!
> 
> If there is a desire for further discussion I am certainly open to reverting 
> this to a discussion thread. For now I’ll keep this vote open until we get 
> either 3 binding votes or further request for discussion from the community.
> 
> Do you have any additional thoughts on the KIP you’d like to add?
> 
> Cheers,
> Rhys
> 
>> On Aug 10, 2018, at 2:14 AM, Stephane Maarek 
>>  wrote:
>> 
>> Hi Rhys,
>> 
>> Overall I'm +1 (non binding), but you're going to need 3 binding votes for
>> this KIP to pass.
>> I don't feel there has been enough discussion on this from the community.
>> Can we get some input from other people?
>> 
>> Thanks for starting the vote nonetheless :)
>> Stephane
>> 
>> On 8 August 2018 at 20:28, McCaig, Rhys  wrote:
>> 
>>> Hi
>>> 
>>> Could we get a couple of votes on this KIP - voting closes in 24 hours.
>>> 
>>> Thanks,
>>> 
>>> Rhys
>>> 
>>>> On Aug 6, 2018, at 11:51 AM, McCaig, Rhys 
>>> wrote:
>>>> 
>>>> Hi All,
>>>> 
>>>> I’m starting a vote on KIP-310: Add a Kafka Source Connector to Kafka
>>> Connect
>>>> 
>>>> KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-310%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect
>>>> Discussion Thread: http://mail-archives.apache.org/mod_mbox/kafka-dev/201808.mbox/%3c17E8D696-E51C-4BEB-bd70-9324d4b53...@comcast.com%3e
>>>> 
>>>> Cheers,
>>>> Rhys
>>> 
>>> 
> 



Re: [EXTERNAL] [VOTE] KIP-310: Add a Kafka Source Connector to Kafka Connect

2018-08-10 Thread McCaig, Rhys
Thanks Stephane!

If there is a desire for further discussion I am certainly open to reverting 
this to a discussion thread. For now I’ll keep this vote open until we get 
either 3 binding votes or further request for discussion from the community.

Do you have any additional thoughts on the KIP you’d like to add?

Cheers,
Rhys

> On Aug 10, 2018, at 2:14 AM, Stephane Maarek  
> wrote:
> 
> Hi Rhys,
> 
> Overall I'm +1 (non binding), but you're going to need 3 binding votes for
> this KIP to pass.
> I don't feel there has been enough discussion on this from the community.
> Can we get some input from other people?
> 
> Thanks for starting the vote nonetheless :)
> Stephane
> 
> On 8 August 2018 at 20:28, McCaig, Rhys  wrote:
> 
>> Hi
>> 
>> Could we get a couple of votes on this KIP - voting closes in 24 hours.
>> 
>> Thanks,
>> 
>> Rhys
>> 
>>> On Aug 6, 2018, at 11:51 AM, McCaig, Rhys 
>> wrote:
>>> 
>>> Hi All,
>>> 
>>> I’m starting a vote on KIP-310: Add a Kafka Source Connector to Kafka
>> Connect
>>> 
>>> KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-310%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect
>>> Discussion Thread: http://mail-archives.apache.org/mod_mbox/kafka-dev/201808.mbox/%3c17E8D696-E51C-4BEB-bd70-9324d4b53...@comcast.com%3e
>>> 
>>> Cheers,
>>> Rhys
>> 
>> 



Re: [EXTERNAL] [VOTE] KIP-310: Add a Kafka Source Connector to Kafka Connect

2018-08-08 Thread McCaig, Rhys
Hi

Could we get a couple of votes on this KIP - voting closes in 24 hours.

Thanks,

Rhys

> On Aug 6, 2018, at 11:51 AM, McCaig, Rhys  wrote:
> 
> Hi All,
> 
> I’m starting a vote on KIP-310: Add a Kafka Source Connector to Kafka Connect
> 
> KIP: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-310%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect
> Discussion Thread: 
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201808.mbox/%3c17e8d696-e51c-4beb-bd70-9324d4b53...@comcast.com%3e
> 
> Cheers,
> Rhys



[VOTE] KIP-310: Add a Kafka Source Connector to Kafka Connect

2018-08-06 Thread McCaig, Rhys
Hi All,

I’m starting a vote on KIP-310: Add a Kafka Source Connector to Kafka Connect

KIP: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-310%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect
Discussion Thread: 
http://mail-archives.apache.org/mod_mbox/kafka-dev/201808.mbox/%3c17e8d696-e51c-4beb-bd70-9324d4b53...@comcast.com%3e

Cheers,
Rhys


Re: [EXTERNAL] [DISCUSS] KIP-310: Add a Kafka Source Connector to Kafka Connect

2018-08-04 Thread McCaig, Rhys
Hi All,

If there are no further comments on this KIP I’ll start a vote early this week.

Rhys

On Aug 1, 2018, at 12:32 AM, McCaig, Rhys 
<rhys_mcc...@cable.comcast.com> wrote:

Hi All,

I’ve updated the proposal to include the improvements suggested by Stephane.

I have also submitted a PR to implement this functionality into Kafka. 
https://github.com/apache/kafka/pull/5438

I don’t have a benchmark against MirrorMaker yet, as I only currently have a 
local docker stack available to me, though I have seen very good performance in 
that test stack (200k messages/sec@100bytes on limited compute resource 
containers). Further benchmarking might take a few days.

Review and comments would be appreciated.

Cheers,
Rhys


On Jun 18, 2018, at 9:00 AM, McCaig, Rhys 
<rhys_mcc...@cable.comcast.com> wrote:

Hi Stephane,

Thanks for your feedback and apologies for the delay in my response.

Are there any performance benchmarks against Mirror Maker available? I'm
interested to know if this is more performant / scalable.
Regarding the implementation, here's some feedback:


Currently I don’t have any performance benchmarks, but I think this is a great 
idea. I’ll see if I can set something up in the next week or so.

- I think it's worth mentioning that this solution does not rely on
consumer groups, and therefore tracking progress may be tricky. Can you
think of a way to expose that?

This is a reasonable concern. I’m not sure how to track this other than looking 
at the Kafka Connect offsets. Once a message is passed to the framework, I'm 
unaware of a way to get at the commit offsets on the producer side. Any 
thoughts?

- Some code can be in config Validator I believe:
https://github.com/Comcast/MirrorTool-for-Kafka-Connect/blob/master/src/main/java/com/comcast/kafka/connect/kafka/KafkaSourceConnector.java#L47

- I think your kip mentions `source.admin.` and `source.consumer.` but I
don't see it reflected yet in the code

- Is there a way to be flexible and merge list and regex, or offer the two
simultaneously ? source_topics=my_static_topic,prefix.* ?

Agree on all of the above - I will incorporate these into the code later this 
week, as I’ll get some time back to work on this.

Cheers,
Rhys



On Jun 6, 2018, at 7:16 PM, Stephane Maarek 
<steph...@simplemachines.com.au> wrote:

Hi Rhys,

I think this will be a great addition.

Are there any performance benchmarks against Mirror Maker available? I'm
interested to know if this is more performant / scalable.
Regarding the implementation, here's some feedback:

- I think it's worth mentioning that this solution does not rely on
consumer groups, and therefore tracking progress may be tricky. Can you
think of a way to expose that?


- Some code can be in config Validator I believe:
https://github.com/Comcast/MirrorTool-for-Kafka-Connect/blob/master/src/main/java/com/comcast/kafka/connect/kafka/KafkaSourceConnector.java#L47

- I think your kip mentions `source.admin.` and `source.consumer.` but I
don't see it reflected yet in the code

- Is there a way to be flexible and merge list and regex, or offer the two
simultaneously ? source_topics=my_static_topic,prefix.* ?

Hope that helps
Stephane

Kind regards,
Stephane

Stephane Maarek | Developer

+61 416 575 980
steph...@simplemachines.com.au
simplemachines.com.au
Level 2, 145 William Street, Sydney NSW 2010

On 5 June 2018 at 09:04, McCaig, Rhys 
<rhys_mcc...@comcast.com> wrote:

Hi All,

As I didn’t get any comments on this KIP and there have since been 2
additional KIPs created with the number 308, I'm bumping this and
renaming the KIP to 310 to remove the duplication:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-310%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect

Let me know if you have any comments or feedback, would love to hear them.

Cheers,
Rhys

On May 28, 2018, at 10:23 PM, McCaig, Rhys <rhys_mcc...@comcast.com> wrote:

Sorry for the bad link to the KIP, here it is: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-308%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect

On May 28, 2018, at 10:19 PM, McCaig, Rhys <rhys_mcc...@comcast.com> wrote:

Hi All,

I added a KIP to include a Kafka Source Connector with Kafka Connect.
Here is the KIP: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-308%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect

Looking forward to your feedback and suggestions.

Cheers,
Rhys










Re: [EXTERNAL] [DISCUSS] KIP-310: Add a Kafka Source Connector to Kafka Connect

2018-08-01 Thread McCaig, Rhys
Hi All,

I’ve updated the proposal to include the improvements suggested by Stephane.

I have also submitted a PR to implement this functionality into Kafka. 
https://github.com/apache/kafka/pull/5438

I don’t have a benchmark against MirrorMaker yet, as I only currently have a 
local docker stack available to me, though I have seen very good performance in 
that test stack (200k messages/sec@100bytes on limited compute resource 
containers). Further benchmarking might take a few days.

Review and comments would be appreciated.

Cheers,
Rhys


On Jun 18, 2018, at 9:00 AM, McCaig, Rhys 
<rhys_mcc...@cable.comcast.com> wrote:

Hi Stephane,

Thanks for your feedback and apologies for the delay in my response.

Are there any performance benchmarks against Mirror Maker available? I'm
interested to know if this is more performant / scalable.
Regarding the implementation, here's some feedback:


Currently I don’t have any performance benchmarks, but I think this is a great 
idea. I’ll see if I can set something up in the next week or so.

- I think it's worth mentioning that this solution does not rely on
consumer groups, and therefore tracking progress may be tricky. Can you
think of a way to expose that?

This is a reasonable concern. I’m not sure how to track this other than looking 
at the Kafka Connect offsets. Once a message is passed to the framework, I'm 
unaware of a way to get at the commit offsets on the producer side. Any 
thoughts?

- Some code can be in config Validator I believe:
https://github.com/Comcast/MirrorTool-for-Kafka-Connect/blob/master/src/main/java/com/comcast/kafka/connect/kafka/KafkaSourceConnector.java#L47

- I think your kip mentions `source.admin.` and `source.consumer.` but I
don't see it reflected yet in the code

- Is there a way to be flexible and merge list and regex, or offer the two
simultaneously ? source_topics=my_static_topic,prefix.* ?

Agree on all of the above - I will incorporate these into the code later this 
week, as I’ll get some time back to work on this.

Cheers,
Rhys



On Jun 6, 2018, at 7:16 PM, Stephane Maarek 
<steph...@simplemachines.com.au> wrote:

Hi Rhys,

I think this will be a great addition.

Are there any performance benchmarks against Mirror Maker available? I'm
interested to know if this is more performant / scalable.
Regarding the implementation, here's some feedback:

- I think it's worth mentioning that this solution does not rely on
consumer groups, and therefore tracking progress may be tricky. Can you
think of a way to expose that?


- Some code can be in config Validator I believe:
https://github.com/Comcast/MirrorTool-for-Kafka-Connect/blob/master/src/main/java/com/comcast/kafka/connect/kafka/KafkaSourceConnector.java#L47

- I think your kip mentions `source.admin.` and `source.consumer.` but I
don't see it reflected yet in the code

- Is there a way to be flexible and merge list and regex, or offer the two
simultaneously ? source_topics=my_static_topic,prefix.* ?

Hope that helps
Stephane

Kind regards,
Stephane

Stephane Maarek | Developer

+61 416 575 980
steph...@simplemachines.com.au
simplemachines.com.au
Level 2, 145 William Street, Sydney NSW 2010

On 5 June 2018 at 09:04, McCaig, Rhys  wrote:

Hi All,

As I didn’t get any comments on this KIP and there have since been 2
additional KIPs created with the number 308, I'm bumping this and
renaming the KIP to 310 to remove the duplication:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-310%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect

Let me know if you have any comments or feedback, would love to hear them.

Cheers,
Rhys

On May 28, 2018, at 10:23 PM, McCaig, Rhys 
wrote:

Sorry for the bad link to the KIP, here it is: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-308%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect

On May 28, 2018, at 10:19 PM, McCaig, Rhys 
wrote:

Hi All,

I added a KIP to include a Kafka Source Connector with Kafka Connect.
Here is the KIP: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-308%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect

Looking forward to your feedback and suggestions.

Cheers,
Rhys









Re: [EXTERNAL] [DISCUSS] KIP-310: Add a Kafka Source Connector to Kafka Connect

2018-06-18 Thread McCaig, Rhys
Hi Stephane,

Thanks for your feedback and apologies for the delay in my response. 

> Are there any performance benchmarks against Mirror Maker available? I'm
> interested to know if this is more performant / scalable.
> Regarding the implementation, here's some feedback:


Currently I don’t have any performance benchmarks, but I think this is a great 
idea. I’ll see if I can set something up in the next week or so. 

> - I think it's worth mentioning that this solution does not rely on
> consumer groups, and therefore tracking progress may be tricky. Can you
> think of a way to expose that?

This is a reasonable concern. I’m not sure how to track this other than looking 
at the Kafka Connect offsets. Once a message is passed to the framework, I'm 
unaware of a way to get at the commit offsets on the producer side. Any 
thoughts?

> - Some code can be in config Validator I believe:
> https://github.com/Comcast/MirrorTool-for-Kafka-Connect/blob/master/src/main/java/com/comcast/kafka/connect/kafka/KafkaSourceConnector.java#L47
> 
> - I think your kip mentions `source.admin.` and `source.consumer.` but I
> don't see it reflected yet in the code
> 
> - Is there a way to be flexible and merge list and regex, or offer the two
> simultaneously ? source_topics=my_static_topic,prefix.* ?

Agree on all of the above - I will incorporate these into the code later this 
week, as I’ll get some time back to work on this.
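
On the last point, one possible way to accept a static list and a regex in the 
same setting is sketched below. This is only an assumption about how it could 
work, not what the connector does today: the `SourceTopicSelector` class is a 
hypothetical name, and the sketch simply treats every comma-separated entry of 
the `source_topics` value from Stephane's example as a regular expression.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class SourceTopicSelector {

    // One compiled pattern per comma-separated entry.
    private final List<Pattern> patterns = new ArrayList<>();

    /** Splits the setting on commas and compiles each non-empty entry as a regex. */
    public SourceTopicSelector(String sourceTopics) {
        for (String entry : sourceTopics.split(",")) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty()) {
                patterns.add(Pattern.compile(trimmed));
            }
        }
    }

    /** True if the whole topic name matches at least one entry. */
    public boolean matches(String topic) {
        return patterns.stream().anyMatch(p -> p.matcher(topic).matches());
    }

    /** Filters the topics found on the source cluster down to the selected ones. */
    public List<String> select(List<String> availableTopics) {
        return availableTopics.stream()
                .filter(this::matches)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        SourceTopicSelector selector = new SourceTopicSelector("my_static_topic,prefix.*");
        List<String> available = Arrays.asList("my_static_topic", "prefix.orders", "other");
        System.out.println(selector.select(available));
    }
}
```

Two caveats with this approach: a static name containing `.` is also 
interpreted as a regex, so it can match more than the literal name, and a 
pattern that itself contains a comma would be split incorrectly. Separate 
settings for the list and the regex would avoid both.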

Cheers,
Rhys



> On Jun 6, 2018, at 7:16 PM, Stephane Maarek  
> wrote:
> 
> Hi Rhys,
> 
> I think this will be a great addition.
> 
> Are there any performance benchmarks against Mirror Maker available? I'm
> interested to know if this is more performant / scalable.
> Regarding the implementation, here's some feedback:
> 
> - I think it's worth mentioning that this solution does not rely on
> consumer groups, and therefore tracking progress may be tricky. Can you
> think of a way to expose that?
> 

> - Some code can be in config Validator I believe:
> https://github.com/Comcast/MirrorTool-for-Kafka-Connect/blob/master/src/main/java/com/comcast/kafka/connect/kafka/KafkaSourceConnector.java#L47
> 
> - I think your kip mentions `source.admin.` and `source.consumer.` but I
> don't see it reflected yet in the code
> 
> - Is there a way to be flexible and merge list and regex, or offer the two
> simultaneously ? source_topics=my_static_topic,prefix.* ?
> 
> Hope that helps
> Stephane
> 
> Kind regards,
> Stephane
> 
> Stephane Maarek | Developer
> 
> +61 416 575 980
> steph...@simplemachines.com.au
> simplemachines.com.au
> Level 2, 145 William Street, Sydney NSW 2010
> 
> On 5 June 2018 at 09:04, McCaig, Rhys  wrote:
> 
>> Hi All,
>> 
>> As I didn’t get any comment on this KIP and there has since been an
>> additional 2 KIP’s created numbered 308 since, I'm bumping this and
>> renaming the KIP to 310 to remove the duplication:
>> 
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-310%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect
>> 
>> Let me know if you have any comments or feedback, would love to hear them.
>> 
>> Cheers,
>> Rhys
>> 
>>> On May 28, 2018, at 10:23 PM, McCaig, Rhys 
>> wrote:
>>> 
>>> Sorry for the bad link to the KIP, here it is: 
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-308%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect
>>> 
>>>> On May 28, 2018, at 10:19 PM, McCaig, Rhys 
>> wrote:
>>>> 
>>>> Hi All,
>>>> 
>>>> I added a KIP to include a Kafka Source Connector with Kafka Connect.
>>>> Here is the KIP: 
>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-308%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect
>>>> 
>>>> Looking forward to your feedback and suggestions.
>>>> 
>>>> Cheers,
>>>> Rhys
>>>> 
>>>> 
>>> 
>> 
>> 



[DISCUSS] KIP-310: Add a Kafka Source Connector to Kafka Connect

2018-06-04 Thread McCaig, Rhys
Hi All,

As I didn’t get any comments on this KIP and there have since been 2 additional 
KIPs created with the number 308, I'm bumping this and renaming the KIP to 
310 to remove the duplication:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-310%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect

Let me know if you have any comments or feedback, would love to hear them.

Cheers,
Rhys

> On May 28, 2018, at 10:23 PM, McCaig, Rhys  wrote:
> 
> Sorry for the bad link to the KIP, here it is: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-308%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect
> 
>> On May 28, 2018, at 10:19 PM, McCaig, Rhys  wrote:
>> 
>> Hi All,
>> 
>> I added a KIP to include a Kafka Source Connector with Kafka Connect.
>> Here is the KIP: 
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-308%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect
>> 
>> Looking forward to your feedback and suggestions.
>> 
>> Cheers,
>> Rhys
>> 
>> 
> 



Re: [EXTERNAL] [DISCUSS] KIP-308: Add a Kafka Source Connector to Kafka Connect

2018-05-28 Thread McCaig, Rhys
Sorry for the bad link to the KIP, here it is: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-308%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect

> On May 28, 2018, at 10:19 PM, McCaig, Rhys  wrote:
> 
> Hi All,
> 
> I added a KIP to include a Kafka Source Connector with Kafka Connect.
> Here is the KIP: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-308%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect
> 
> Looking forward to your feedback and suggestions.
> 
> Cheers,
> Rhys
> 
> 



[DISCUSS] KIP-308: Add a Kafka Source Connector to Kafka Connect

2018-05-28 Thread McCaig, Rhys
Hi All,

I added a KIP to include a Kafka Source Connector with Kafka Connect.
Here is the KIP: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-308%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect

Looking forward to your feedback and suggestions.

Cheers,
Rhys




Re: [EXTERNAL] Kafka Connect: New Kafka Source Connector

2018-05-15 Thread McCaig, Rhys
Hi Team,

Would someone be able to provide me with Confluence permission in order to 
write a KIP for the code below?

User: https://cwiki.apache.org/confluence/display/~mccaig

Cheers,
Rhys

On May 11, 2018, at 4:45 PM, McCaig, Rhys 
<rhys_mcc...@comcast.com> wrote:

Hi there,

Over at Comcast we just open sourced a Kafka source connector for Kafka 
Connect. (https://github.com/Comcast/MirrorTool-for-Kafka-Connect) We’ve used 
this as an alternative to MirrorMaker on a couple of projects.
While discussing open sourcing the project, we realized that the functionality 
is similar to a connector that was suggested in the original Kafka Connect KIP 
(https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=58851767#KIP-26-AddKafkaConnectframeworkfordataimport/export-MirrorMaker).

Given this - we were wondering if there would be interest from the Kafka 
community in adopting the connector into the main Kafka codebase. We’d be more 
than happy to donate the code and help get it integrated.

Cheers,
Rhys McCaig



Kafka Connect: New Kafka Source Connector

2018-05-11 Thread McCaig, Rhys
Hi there,

Over at Comcast we just open sourced a Kafka source connector for Kafka 
Connect. (https://github.com/Comcast/MirrorTool-for-Kafka-Connect) We’ve used 
this as an alternative to MirrorMaker on a couple of projects.
While discussing open sourcing the project, we realized that the functionality 
is similar to a connector that was suggested in the original Kafka Connect KIP 
(https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=58851767#KIP-26-AddKafkaConnectframeworkfordataimport/export-MirrorMaker).

Given this - we were wondering if there would be interest from the Kafka 
community in adopting the connector into the main Kafka codebase. We’d be more 
than happy to donate the code and help get it integrated.

Cheers,
Rhys McCaig