Re: [VOTE] 2.0.1 RC0

2018-11-02 Thread Ewen Cheslack-Postava
+1

-Ewen

On Thu, Nov 1, 2018 at 10:10 AM Manikumar  wrote:

> We were waiting for the system test results. There were a few failures
> (KAFKA-7579, KAFKA-7559, KAFKA-7561), but they are not blockers for the
> 2.0.1 release. We need more votes from PMC/committers :)
>
> Thanks, Stanislav, for the system test results.
>
> Thanks,
> Manikumar
>
> On Thu, Nov 1, 2018 at 10:20 PM Eno Thereska 
> wrote:
>
> > Anything else holding this up?
> >
> > Thanks
> > Eno
> >
> > On Thu, Nov 1, 2018 at 10:27 AM Jakub Scholz  wrote:
> >
> > > +1 (non-binding) ... I used the staged binaries and ran tests with
> > > different clients.
> > >
> > > On Fri, Oct 26, 2018 at 4:29 AM Manikumar 
> > > wrote:
> > >
> > > > Hello Kafka users, developers and client-developers,
> > > >
> > > > This is the first candidate for release of Apache Kafka 2.0.1.
> > > >
> > > > This is a bug fix release closing 49 tickets:
> > > > https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+2.0.1
> > > >
> > > > Release notes for the 2.0.1 release:
> > > > http://home.apache.org/~manikumar/kafka-2.0.1-rc0/RELEASE_NOTES.html
> > > >
> > > > *** Please download, test and vote by Tuesday, October 30, end of day ***
> > > >
> > > > Kafka's KEYS file containing PGP keys we use to sign the release:
> > > > http://kafka.apache.org/KEYS
> > > >
> > > > * Release artifacts to be voted upon (source and binary):
> > > > http://home.apache.org/~manikumar/kafka-2.0.1-rc0/
> > > >
> > > > * Maven artifacts to be voted upon:
> > > > https://repository.apache.org/content/groups/staging/
> > > >
> > > > * Javadoc:
> > > > http://home.apache.org/~manikumar/kafka-2.0.1-rc0/javadoc/
> > > >
> > > > * Tag to be voted upon (off 2.0 branch) is the 2.0.1 tag:
> > > > https://github.com/apache/kafka/releases/tag/2.0.1-rc0
> > > >
> > > > * Documentation:
> > > > http://kafka.apache.org/20/documentation.html
> > > >
> > > > * Protocol:
> > > > http://kafka.apache.org/20/protocol.html
> > > >
> > > > * Successful Jenkins builds for the 2.0 branch:
> > > > Unit/integration tests: https://builds.apache.org/job/kafka-2.0-jdk8/177/
> > > >
> > > > /**
> > > >
> > > > Thanks,
> > > > Manikumar
> > > >
> > >
> >
>


Re: Kafka Connect task re-balance repeatedly

2018-03-22 Thread Ewen Cheslack-Postava
The log is showing that the Connect worker is trying to make sure it has
read the entire log and gets to offset 119, but some other worker says it
has read to offset 169. The two are in inconsistent states, so the one that
seems to be behind will not start work with potentially outdated
configuration info.

Since it is logging that it *finished* reading to the end of the log, the
worker has completed the steps of checking the end offset and validating
that it read up to that offset. That seems to indicate the two workers
aren't looking at the same data.

One case I'm aware of where this can happen is if you have two connect
workers that are configured to be in the same Connect worker group (their
worker configs have the same group.id) but they are actually looking at
different config topics (config.storage.topic is different for the two
workers). Sometimes people make this mistake when they start running
multiple Connect clusters but forget to make some of the default settings
unique. I'd probably start by looking into that possibility to debug this
issue.
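As a quick local check, comparing those two settings across both workers' configs will reveal the mismatch. The file paths and values below are hypothetical stand-ins; substitute your actual worker property files:

```shell
# Hypothetical repro: two worker configs that share a group.id but point at
# different config topics -- the misconfiguration described above.
mkdir -p /tmp/connect-check
printf 'group.id=connect-cluster\nconfig.storage.topic=connect-configs\n' \
  > /tmp/connect-check/worker-a.properties
printf 'group.id=connect-cluster\nconfig.storage.topic=connect-configs-2\n' \
  > /tmp/connect-check/worker-b.properties
# Compare the two settings side by side; an identical group.id with
# different config.storage.topic values is the inconsistent state.
grep -H -E '^(group\.id|config\.storage\.topic)=' /tmp/connect-check/*.properties
```

If `group.id` matches but `config.storage.topic` differs, the workers join one rebalance group while reading different config logs, which produces exactly this kind of offset disagreement.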

-Ewen

On Wed, Mar 21, 2018 at 10:15 PM, Ziliang Chen  wrote:

> Hi,
>
> I have 2 Kafka Connect instances running on 2 boxes, which form a Kafka
> Connect cluster. One of the instances seems to be doing the re-balance
> repeatedly in a dead loop without running the actual Sink task; the other
> works fine. The following is the output message in the console.
>
> May I ask if you have ever encountered similar issue before ?
>
> Thank you so much !
>
>
> [2018-03-21 14:53:15,671] WARN Catching up to assignment's config offset.
> > (org.apache.kafka.connect.runtime.distributed.DistributedHerder:762)
> > [2018-03-21 14:53:15,671] INFO Current config state offset 119 is behind
> > group assignment 169, reading to end of config log
> > (org.apache.kafka.connect.runtime.distributed.DistributedHerder:807)
> > [2018-03-21 14:53:16,046] INFO Finished reading to end of log and updated
> > config snapshot, new config log offset: 119
> > (org.apache.kafka.connect.runtime.distributed.DistributedHerder:811)
> > [2018-03-21 14:53:16,046] INFO Current config state offset 119 does not
> > match group assignment 169. Forcing rebalance.
> > (org.apache.kafka.connect.runtime.distributed.DistributedHerder:786)
> > [2018-03-21 14:53:16,046] INFO Rebalance started
> > (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1214)
> > [2018-03-21 14:53:16,046] INFO Wasn't unable to resume work after last
> > rebalance, can skip stopping connectors and tasks
> > (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1246)
> > [2018-03-21 14:53:16,046] INFO [Worker clientId=connect-1,
> > groupId=kafka-connect-NA-sink] (Re-)joining group
> > (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:336)
> > [2018-03-21 14:53:16,172] INFO [Worker clientId=connect-1,
> > groupId=kafka-connect-NA-sink] Successfully joined group with generation
> 14
> > (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:341)
> > [2018-03-21 14:53:16,173] INFO Joined group and got assignment:
> > Assignment{error=0,
> > leader='connect-1-3b12c02b-070e-431b-9ec9-924cffd2b6dc', leaderUrl='
> > http://1.1.1.1:8083/', offset=169, connectorIds=[Send_To_NA],
> >  taskIds=[Send_To_NA-1, Send_To_NA-3, Send_To_NA-5, Send_To_NA-7,
> > Send_To_NA-9, Send_To_NA-11, Send_To_NA-13, Send_To_NA-15, Send_To_NA-17,
> > Send_To_NA-19, Send_To_NA-21]}
> > (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1192
>
>
>
> --
> Regards, Zi-Liang
>
> Mail:zlchen@gmail.com
>


[ANNOUNCE] Apache Kafka 1.0.1 Released

2018-03-06 Thread Ewen Cheslack-Postava
The Apache Kafka community is pleased to announce the release for Apache Kafka
1.0.1.

This is a bugfix release for the 1.0 branch that was first released with 1.0.0 
about 4 months ago. We've fixed 49 issues since that release. Most of these are 
non-critical, but in aggregate these fixes will have significant impact. A few 
of the more significant fixes include:

* KAFKA-6277: Make loadClass thread-safe for class loaders of Connect plugins
* KAFKA-6185: Selector memory leak with high likelihood of OOM in case of down 
conversion
* KAFKA-6269: KTable state restore fails after rebalance
* KAFKA-6190: GlobalKTable never finishes restoring when consuming 
transactional messages
* KAFKA-6529: Stop file descriptor leak when client disconnects with staged 
receives
* KAFKA-6238: Issues with protocol version when applying a rolling upgrade to 
1.0.0


All of the changes in this release can be found in the release notes:


https://dist.apache.org/repos/dist/release/kafka/1.0.1/RELEASE_NOTES.html



You can download the source release from:


https://www.apache.org/dyn/closer.cgi?path=/kafka/1.0.1/kafka-1.0.1-src.tgz


and binary releases from:


https://www.apache.org/dyn/closer.cgi?path=/kafka/1.0.1/kafka_2.11-1.0.1.tgz
(Scala 2.11)

https://www.apache.org/dyn/closer.cgi?path=/kafka/1.0.1/kafka_2.12-1.0.1.tgz
(Scala 2.12)

---


Apache Kafka is a distributed streaming platform with four core APIs:


** The Producer API allows an application to publish a stream of records to
one or more Kafka topics.


** The Consumer API allows an application to subscribe to one or more
topics and process the stream of records produced to them.


** The Streams API allows an application to act as a stream processor,
consuming an input stream from one or more topics and producing an output
stream to one or more output topics, effectively transforming the input
streams to output streams.


** The Connector API allows building and running reusable producers or
consumers that connect Kafka topics to existing applications or data
systems. For example, a connector to a relational database might capture
every change to a table.



With these APIs, Kafka can be used for two broad classes of application:


** Building real-time streaming data pipelines that reliably get data
between systems or applications.


** Building real-time streaming applications that transform or react to the
streams of data.



Apache Kafka is in use at large and small companies worldwide, including 
Capital One, Goldman Sachs, ING, LinkedIn, Netflix, Pinterest, Rabobank, 
Target, The New York Times, Uber, Yelp, and Zalando, among others.



A big thank you to the following 36 contributors to this release!

Alex Good, Andras Beni, Andy Bryant, Arjun Satish, Bill Bejeck, Colin P. 
Mccabe, Colin Patrick McCabe, ConcurrencyPractitioner, Damian Guy, Daniel 
Wojda, Dong Lin, Edoardo Comar, Ewen Cheslack-Postava, Filipe Agapito, fredfp, 
Guozhang Wang, huxihx, Ismael Juma, Jason Gustafson, Jeremy Custenborder, 
Jiangjie (Becket) Qin, Joel Hamill, Konstantine Karantasis, lisa2lisa, Logan 
Buckley, Manjula K, Matthias J. Sax, Nick Chiu, parafiend, Rajini Sivaram, 
Randall Hauch, Robert Yokota, Ron Dagostino, tedyu, Yaswanth Kumar, Yu.


We welcome your help and feedback. For more information on how to report
problems, and to get involved, visit the project website at
http://kafka.apache.org/


Thank you!
Ewen


[VOTE] 1.0.1 RC2

2018-02-21 Thread Ewen Cheslack-Postava
Hello Kafka users, developers and client-developers,

This is the third candidate for release of Apache Kafka 1.0.1.

This is a bugfix release for the 1.0 branch that was first released with
1.0.0 about 3 months ago. We've fixed 49 issues since that release. Most of
these are non-critical, but in aggregate these fixes will have significant
impact. A few of the more significant fixes include:

* KAFKA-6277: Make loadClass thread-safe for class loaders of Connect
plugins
* KAFKA-6185: Selector memory leak with high likelihood of OOM in case of
down conversion
* KAFKA-6269: KTable state restore fails after rebalance
* KAFKA-6190: GlobalKTable never finishes restoring when consuming
transactional messages
* KAFKA-6529: Stop file descriptor leak when client disconnects with staged
receives
* KAFKA-6238: Issues with protocol version when applying a rolling upgrade
to 1.0.0

Release notes for the 1.0.1 release:
http://home.apache.org/~ewencp/kafka-1.0.1-rc2/RELEASE_NOTES.html

*** Please download, test and vote by Saturday Feb 24, 9pm PT ***

Kafka's KEYS file containing PGP keys we use to sign the release:
http://kafka.apache.org/KEYS

* Release artifacts to be voted upon (source and binary):
http://home.apache.org/~ewencp/kafka-1.0.1-rc2/

* Maven artifacts to be voted upon:
https://repository.apache.org/content/groups/staging/

* Javadoc:
http://home.apache.org/~ewencp/kafka-1.0.1-rc2/javadoc/

* Tag to be voted upon (off 1.0 branch) is the 1.0.1 tag:
https://github.com/apache/kafka/tree/1.0.1-rc2

* Documentation:
http://kafka.apache.org/10/documentation.html

* Protocol:
http://kafka.apache.org/10/protocol.html

/**

Thanks,
Ewen Cheslack-Postava


Re: [VOTE] 1.0.1 RC1

2018-02-20 Thread Ewen Cheslack-Postava
Just a heads up that an issue with upgrades was found
https://issues.apache.org/jira/browse/KAFKA-6238 There's a PR in progress
to address the underlying issue. There was a workaround, but it looked like
this would cause more pain than doing one more RC, so we'll have an RC2 up
soon after the PR is merged.

-Ewen

On Fri, Feb 16, 2018 at 3:53 AM, Mickael Maison <mickael.mai...@gmail.com>
wrote:

> Ran tests from source and quickstart with binaries
>
> +1 (non-binding)
>
> On Fri, Feb 16, 2018 at 6:05 AM, Jason Gustafson <ja...@confluent.io>
> wrote:
> > +1. Verified artifacts and ran quickstart. Thanks Ewen!
> >
> > -Jason
> >
> > On Thu, Feb 15, 2018 at 1:42 AM, Rajini Sivaram <rajinisiva...@gmail.com
> >
> > wrote:
> >
> >> +1
> >>
> >> Ran quickstart with binaries, built source and ran tests,
> >>
> >> Thank you for running the release, Ewen.
> >>
> >> Regards,
> >>
> >> Rajini
> >>
> >> On Thu, Feb 15, 2018 at 2:31 AM, Guozhang Wang <wangg...@gmail.com>
> wrote:
> >>
> >> > +1
> >> >
> >> > Ran tests, verified web docs.
> >> >
> >> > On Wed, Feb 14, 2018 at 6:00 PM, Satish Duggana <
> >> satish.dugg...@gmail.com>
> >> > wrote:
> >> >
> >> > > +1 (non-binding)
> >> > >
> >> > > - Ran testAll/releaseTarGzAll on 1.0.1-rc1
> >> > > <https://github.com/apache/kafka/tree/1.0.1-rc1> tag
> >> > > - Ran through quickstart of core/streams
> >> > >
> >> > > Thanks,
> >> > > Satish.
> >> > >
> >> > >
> >> > > On Tue, Feb 13, 2018 at 11:30 PM, Damian Guy <damian@gmail.com>
> >> > wrote:
> >> > >
> >> > > > +1
> >> > > >
> >> > > > Ran tests, verified streams quickstart works
> >> > > >
> >> > > > On Tue, 13 Feb 2018 at 17:52 Damian Guy <damian@gmail.com>
> >> wrote:
> >> > > >
> >> > > > > Thanks Ewen - I had the staging repo set up as a profile that I
> >> > > > > forgot to add to my maven command. All good.
> >> > > > >
> >> > > > > On Tue, 13 Feb 2018 at 17:41 Ewen Cheslack-Postava <
> >> > e...@confluent.io>
> >> > > > > wrote:
> >> > > > >
> >> > > > >> Damian,
> >> > > > >>
> >> > > > >> Which quickstart are you referring to? The streams quickstart
> only
> >> > > > >> executes
> >> > > > >> pre-built stuff afaict.
> >> > > > >>
> >> > > > >> In any case, if you're building a maven streams project, did
> you
> >> > > modify
> >> > > > it
> >> > > > >> to point to the staging repository at
> >> > > > >> https://repository.apache.org/content/groups/staging/ in
> addition
> >> > to
> >> > > > the
> >> > > > >> default repos? During rc it wouldn't fetch from maven central
> >> since
> >> > it
> >> > > > >> hasn't been published there yet.
> >> > > > >>
> >> > > > >> If that is configured, more complete maven output would be
> >> > > > >> helpful to track down where it is failing to resolve the
> >> > > > >> necessary archetype.
> >> > > > >>
> >> > > > >> -Ewen
> >> > > > >>
> >> > > > >> On Tue, Feb 13, 2018 at 3:03 AM, Damian Guy <
> damian@gmail.com
> >> >
> >> > > > wrote:
> >> > > > >>
> >> > > > >> > Hi Ewen,
> >> > > > >> >
> >> > > > >> > I'm trying to run the streams quickstart and I'm getting:
> >> > > > >> > [ERROR] Failed to execute goal
> >> > > > >> > org.apache.maven.plugins:maven-archetype-plugin:3.0.1:
> generate
> >> > > > >> > (default-cli) on project standalone-pom: The desired
> archetype
> >> > does
> >> > > > not
> >> > >

Re: [VOTE] 1.0.1 RC1

2018-02-13 Thread Ewen Cheslack-Postava
Damian,

Which quickstart are you referring to? The streams quickstart only executes
pre-built stuff afaict.

In any case, if you're building a maven streams project, did you modify it
to point to the staging repository at
https://repository.apache.org/content/groups/staging/ in addition to the
default repos? During rc it wouldn't fetch from maven central since it
hasn't been published there yet.

If that is configured, more complete maven output would be helpful to track
down where it is failing to resolve the necessary archetype.
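One way to wire in the staging repository is a separate Maven settings file. A minimal sketch (the profile id and output path are illustrative choices, not part of any official setup):

```shell
# Sketch: a minimal Maven settings file that adds the Apache staging repo,
# so archetypes not yet on Maven Central can be resolved during an RC vote.
cat > /tmp/staging-settings.xml <<'EOF'
<settings>
  <profiles>
    <profile>
      <id>apache-staging</id>
      <repositories>
        <repository>
          <id>apache-staging</id>
          <url>https://repository.apache.org/content/groups/staging/</url>
        </repository>
      </repositories>
    </profile>
  </profiles>
  <activeProfiles>
    <activeProfile>apache-staging</activeProfile>
  </activeProfiles>
</settings>
EOF
# Then run the archetype generation with this settings file, e.g.:
#   mvn -s /tmp/staging-settings.xml archetype:generate \
#       -DarchetypeGroupId=org.apache.kafka \
#       -DarchetypeArtifactId=streams-quickstart-java \
#       -DarchetypeVersion=1.0.1
```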

-Ewen

On Tue, Feb 13, 2018 at 3:03 AM, Damian Guy <damian@gmail.com> wrote:

> Hi Ewen,
>
> I'm trying to run the streams quickstart and I'm getting:
> [ERROR] Failed to execute goal
> org.apache.maven.plugins:maven-archetype-plugin:3.0.1:generate
> (default-cli) on project standalone-pom: The desired archetype does not
> exist (org.apache.kafka:streams-quickstart-java:1.0.1)
>
> Something I'm missing?
>
> Thanks,
> Damian
>
> On Tue, 13 Feb 2018 at 10:16 Manikumar <manikumar.re...@gmail.com> wrote:
>
> > +1 (non-binding)
> >
> > ran quick-start, unit tests on the src.
> >
> >
> >
> > On Tue, Feb 13, 2018 at 5:31 AM, Ewen Cheslack-Postava <
> e...@confluent.io>
> > wrote:
> >
> > > Thanks for the heads up, I forgot to drop the old ones, I've done that
> > and
> > > rc1 artifacts should be showing up now.
> > >
> > > -Ewen
> > >
> > >
> > > On Mon, Feb 12, 2018 at 12:57 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> > >
> > > > +1
> > > >
> > > > Ran test suite which passed.
> > > >
> > > > BTW it seems the staging repo hasn't been updated yet:
> > > >
> > > > https://repository.apache.org/content/groups/staging/org/
> > > > apache/kafka/kafka-clients/
> > > >
> > > > On Mon, Feb 12, 2018 at 10:16 AM, Ewen Cheslack-Postava <
> > > e...@confluent.io
> > > > >
> > > > wrote:
> > > >
> > > > > And of course I'm +1 since I've already done normal release
> > validation
> > > > > before posting this.
> > > > >
> > > > > -Ewen
> > > > >
> > > > > On Mon, Feb 12, 2018 at 10:15 AM, Ewen Cheslack-Postava <
> > > > e...@confluent.io
> > > > > >
> > > > > wrote:
> > > > >
> > > > > > Hello Kafka users, developers and client-developers,
> > > > > >
> > > > > > This is the second candidate for release of Apache Kafka 1.0.1.
> > > > > >
> > > > > > This is a bugfix release for the 1.0 branch that was first
> released
> > > > with
> > > > > > 1.0.0 about 3 months ago. We've fixed 49 significant issues since
> > > that
> > > > > > release. Most of these are non-critical, but in aggregate these
> > fixes
> > > > > will
> > > > > > have significant impact. A few of the more significant fixes
> > include:
> > > > > >
> > > > > > * KAFKA-6277: Make loadClass thread-safe for class loaders of
> > Connect
> > > > > > plugins
> > > > > > * KAFKA-6185: Selector memory leak with high likelihood of OOM in
> > > case
> > > > of
> > > > > > down conversion
> > > > > > * KAFKA-6269: KTable state restore fails after rebalance
> > > > > > * KAFKA-6190: GlobalKTable never finishes restoring when
> consuming
> > > > > > transactional messages
> > > > > > * KAFKA-6529: Stop file descriptor leak when client disconnects
> > with
> > > > > > staged receives
> > > > > >
> > > > > > Release notes for the 1.0.1 release:
> > > > > > http://home.apache.org/~ewencp/kafka-1.0.1-rc1/
> RELEASE_NOTES.html
> > > > > >
> > > > > > *** Please download, test and vote by Thursday, Feb 15, 5pm PT
> ***
> > > > > >
> > > > > > Kafka's KEYS file containing PGP keys we use to sign the release:
> > > > > > http://kafka.apache.org/KEYS
> > > > > >
> > > > > > * Release artifacts to be voted upon (source and binary):
> > > > > > http://home.apache.org/~ewencp/kafka-1.0.1-rc1/
> > > > > >
> > > > > > * Maven artifacts to be voted upon:
> > > > > > https://repository.apache.org/content/groups/staging/
> > > > > >
> > > > > > * Javadoc:
> > > > > > http://home.apache.org/~ewencp/kafka-1.0.1-rc1/javadoc/
> > > > > >
> > > > > > * Tag to be voted upon (off 1.0 branch) is the 1.0.1 tag:
> > > > > > https://github.com/apache/kafka/tree/1.0.1-rc1
> > > > > >
> > > > > > * Documentation:
> > > > > > http://kafka.apache.org/10/documentation.html
> > > > > >
> > > > > > * Protocol:
> > > > > > http://kafka.apache.org/10/protocol.html
> > > > > >
> > > > > >
> > > > > > Thanks,
> > > > > > Ewen Cheslack-Postava
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [VOTE] 1.0.1 RC1

2018-02-12 Thread Ewen Cheslack-Postava
Thanks for the heads up, I forgot to drop the old ones, I've done that and
rc1 artifacts should be showing up now.

-Ewen


On Mon, Feb 12, 2018 at 12:57 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> +1
>
> Ran test suite which passed.
>
> BTW it seems the staging repo hasn't been updated yet:
>
> https://repository.apache.org/content/groups/staging/org/
> apache/kafka/kafka-clients/
>
> On Mon, Feb 12, 2018 at 10:16 AM, Ewen Cheslack-Postava <e...@confluent.io
> >
> wrote:
>
> > And of course I'm +1 since I've already done normal release validation
> > before posting this.
> >
> > -Ewen
> >
> > On Mon, Feb 12, 2018 at 10:15 AM, Ewen Cheslack-Postava <
> e...@confluent.io
> > >
> > wrote:
> >
> > > Hello Kafka users, developers and client-developers,
> > >
> > > This is the second candidate for release of Apache Kafka 1.0.1.
> > >
> > > This is a bugfix release for the 1.0 branch that was first released
> with
> > > 1.0.0 about 3 months ago. We've fixed 49 significant issues since that
> > > release. Most of these are non-critical, but in aggregate these fixes
> > will
> > > have significant impact. A few of the more significant fixes include:
> > >
> > > * KAFKA-6277: Make loadClass thread-safe for class loaders of Connect
> > > plugins
> > > * KAFKA-6185: Selector memory leak with high likelihood of OOM in case
> of
> > > down conversion
> > > * KAFKA-6269: KTable state restore fails after rebalance
> > > * KAFKA-6190: GlobalKTable never finishes restoring when consuming
> > > transactional messages
> > > * KAFKA-6529: Stop file descriptor leak when client disconnects with
> > > staged receives
> > >
> > > Release notes for the 1.0.1 release:
> > > http://home.apache.org/~ewencp/kafka-1.0.1-rc1/RELEASE_NOTES.html
> > >
> > > *** Please download, test and vote by Thursday, Feb 15, 5pm PT ***
> > >
> > > Kafka's KEYS file containing PGP keys we use to sign the release:
> > > http://kafka.apache.org/KEYS
> > >
> > > * Release artifacts to be voted upon (source and binary):
> > > http://home.apache.org/~ewencp/kafka-1.0.1-rc1/
> > >
> > > * Maven artifacts to be voted upon:
> > > https://repository.apache.org/content/groups/staging/
> > >
> > > * Javadoc:
> > > http://home.apache.org/~ewencp/kafka-1.0.1-rc1/javadoc/
> > >
> > > * Tag to be voted upon (off 1.0 branch) is the 1.0.1 tag:
> > > https://github.com/apache/kafka/tree/1.0.1-rc1
> > >
> > > * Documentation:
> > > http://kafka.apache.org/10/documentation.html
> > >
> > > * Protocol:
> > > http://kafka.apache.org/10/protocol.html
> > >
> > >
> > > Thanks,
> > > Ewen Cheslack-Postava
> > >
> >
>


Re: [VOTE] 1.0.1 RC1

2018-02-12 Thread Ewen Cheslack-Postava
And of course I'm +1 since I've already done normal release validation
before posting this.

-Ewen

On Mon, Feb 12, 2018 at 10:15 AM, Ewen Cheslack-Postava <e...@confluent.io>
wrote:

> Hello Kafka users, developers and client-developers,
>
> This is the second candidate for release of Apache Kafka 1.0.1.
>
> This is a bugfix release for the 1.0 branch that was first released with
> 1.0.0 about 3 months ago. We've fixed 49 significant issues since that
> release. Most of these are non-critical, but in aggregate these fixes will
> have significant impact. A few of the more significant fixes include:
>
> * KAFKA-6277: Make loadClass thread-safe for class loaders of Connect
> plugins
> * KAFKA-6185: Selector memory leak with high likelihood of OOM in case of
> down conversion
> * KAFKA-6269: KTable state restore fails after rebalance
> * KAFKA-6190: GlobalKTable never finishes restoring when consuming
> transactional messages
> * KAFKA-6529: Stop file descriptor leak when client disconnects with
> staged receives
>
> Release notes for the 1.0.1 release:
> http://home.apache.org/~ewencp/kafka-1.0.1-rc1/RELEASE_NOTES.html
>
> *** Please download, test and vote by Thursday, Feb 15, 5pm PT ***
>
> Kafka's KEYS file containing PGP keys we use to sign the release:
> http://kafka.apache.org/KEYS
>
> * Release artifacts to be voted upon (source and binary):
> http://home.apache.org/~ewencp/kafka-1.0.1-rc1/
>
> * Maven artifacts to be voted upon:
> https://repository.apache.org/content/groups/staging/
>
> * Javadoc:
> http://home.apache.org/~ewencp/kafka-1.0.1-rc1/javadoc/
>
> * Tag to be voted upon (off 1.0 branch) is the 1.0.1 tag:
> https://github.com/apache/kafka/tree/1.0.1-rc1
>
> * Documentation:
> http://kafka.apache.org/10/documentation.html
>
> * Protocol:
> http://kafka.apache.org/10/protocol.html
>
>
> Thanks,
> Ewen Cheslack-Postava
>


[VOTE] 1.0.1 RC1

2018-02-12 Thread Ewen Cheslack-Postava
Hello Kafka users, developers and client-developers,

This is the second candidate for release of Apache Kafka 1.0.1.

This is a bugfix release for the 1.0 branch that was first released with
1.0.0 about 3 months ago. We've fixed 49 significant issues since that
release. Most of these are non-critical, but in aggregate these fixes will
have significant impact. A few of the more significant fixes include:

* KAFKA-6277: Make loadClass thread-safe for class loaders of Connect
plugins
* KAFKA-6185: Selector memory leak with high likelihood of OOM in case of
down conversion
* KAFKA-6269: KTable state restore fails after rebalance
* KAFKA-6190: GlobalKTable never finishes restoring when consuming
transactional messages
* KAFKA-6529: Stop file descriptor leak when client disconnects with staged
receives

Release notes for the 1.0.1 release:
http://home.apache.org/~ewencp/kafka-1.0.1-rc1/RELEASE_NOTES.html

*** Please download, test and vote by Thursday, Feb 15, 5pm PT ***

Kafka's KEYS file containing PGP keys we use to sign the release:
http://kafka.apache.org/KEYS

* Release artifacts to be voted upon (source and binary):
http://home.apache.org/~ewencp/kafka-1.0.1-rc1/

* Maven artifacts to be voted upon:
https://repository.apache.org/content/groups/staging/

* Javadoc:
http://home.apache.org/~ewencp/kafka-1.0.1-rc1/javadoc/

* Tag to be voted upon (off 1.0 branch) is the 1.0.1 tag:
https://github.com/apache/kafka/tree/1.0.1-rc1

* Documentation:
http://kafka.apache.org/10/documentation.html

* Protocol:
http://kafka.apache.org/10/protocol.html
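For voters, artifact verification typically means checking signatures against the KEYS file above and verifying checksums. A minimal local sketch of the checksum step, using a stand-in file since the real artifacts must be downloaded from the links above:

```shell
# Stand-in demonstration of checksum verification; with a real RC you would
# download the .tgz and its published checksum file instead of creating them.
printf 'artifact-bytes' > /tmp/kafka-rc-artifact.tgz
sha512sum /tmp/kafka-rc-artifact.tgz > /tmp/kafka-rc-artifact.tgz.sha512
sha512sum -c /tmp/kafka-rc-artifact.tgz.sha512
# Signature checking follows the same shape (filenames illustrative):
#   gpg --import KEYS
#   gpg --verify kafka-1.0.1-src.tgz.asc kafka-1.0.1-src.tgz
```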


Thanks,
Ewen Cheslack-Postava


Re: [VOTE] 1.0.1 RC0

2018-02-09 Thread Ewen Cheslack-Postava
Just a heads up that we had a fix for KAFKA-6529 land to fix a file
descriptor leak. So this RC is dead and I'll be generating RC1 soon.

Thanks,
Ewen

On Wed, Feb 7, 2018 at 11:06 AM, Vahid S Hashemian <
vahidhashem...@us.ibm.com> wrote:

> Hi Ewen,
>
> +1
>
> Building from source and running the quickstart were successful on Ubuntu
> and Windows 10.
>
> Thanks for running the release.
> --Vahid
>
>
>
> From:   Ewen Cheslack-Postava <e...@confluent.io>
> To: d...@kafka.apache.org, users@kafka.apache.org,
> kafka-clie...@googlegroups.com
> Date:   02/05/2018 07:49 PM
> Subject:[VOTE] 1.0.1 RC0
>
>
>
> Hello Kafka users, developers and client-developers,
>
> Sorry for a bit of delay, but I've now prepared the first candidate for
> release of Apache Kafka 1.0.1.
>
> This is a bugfix release for the 1.0 branch that was first released with
> 1.0.0 about 3 months ago. We've fixed 46 significant issues since that
> release. Most of these are non-critical, but in aggregate these fixes will
> have significant impact. A few of the more significant fixes include:
>
> * KAFKA-6277: Make loadClass thread-safe for class loaders of Connect
> plugins
> * KAFKA-6185: Selector memory leak with high likelihood of OOM in case of
> down conversion
> * KAFKA-6269: KTable state restore fails after rebalance
> * KAFKA-6190: GlobalKTable never finishes restoring when consuming
> transactional messages
>
> Release notes for the 1.0.1 release:
> http://home.apache.org/~ewencp/kafka-1.0.1-rc0/RELEASE_NOTES.html
>
> *** Please download, test and vote by Thursday, Feb 8, 12pm PT ***
>
> Kafka's KEYS file containing PGP keys we use to sign the release:
> http://kafka.apache.org/KEYS
>
> * Release artifacts to be voted upon (source and binary):
> http://home.apache.org/~ewencp/kafka-1.0.1-rc0/
>
> * Maven artifacts to be voted upon:
> https://repository.apache.org/content/groups/staging/
>
> * Javadoc:
> http://home.apache.org/~ewencp/kafka-1.0.1-rc0/javadoc/
>
> * Tag to be voted upon (off 1.0 branch) is the 1.0.1 tag:
> https://github.com/apache/kafka/tree/1.0.1-rc0
>
> * Documentation:
> http://kafka.apache.org/10/documentation.html
>
> * Protocol:
> http://kafka.apache.org/10/protocol.html
>
>
>
> Please test and verify the release artifacts and submit a vote for this
> RC,
> or report any issues so we can fix them and get a new RC out ASAP!
> Although
> this release vote requires PMC votes to pass, testing, votes, and bug
> reports are valuable and appreciated from everyone.
>
> Thanks,
> Ewen
>
>
>
>
>


[VOTE] 1.0.1 RC0

2018-02-05 Thread Ewen Cheslack-Postava
Hello Kafka users, developers and client-developers,

Sorry for a bit of delay, but I've now prepared the first candidate for
release of Apache Kafka 1.0.1.

This is a bugfix release for the 1.0 branch that was first released with
1.0.0 about 3 months ago. We've fixed 46 significant issues since that
release. Most of these are non-critical, but in aggregate these fixes will
have significant impact. A few of the more significant fixes include:

* KAFKA-6277: Make loadClass thread-safe for class loaders of Connect
plugins
* KAFKA-6185: Selector memory leak with high likelihood of OOM in case of
down conversion
* KAFKA-6269: KTable state restore fails after rebalance
* KAFKA-6190: GlobalKTable never finishes restoring when consuming
transactional messages

Release notes for the 1.0.1 release:
http://home.apache.org/~ewencp/kafka-1.0.1-rc0/RELEASE_NOTES.html

*** Please download, test and vote by Thursday, Feb 8, 12pm PT ***

Kafka's KEYS file containing PGP keys we use to sign the release:
http://kafka.apache.org/KEYS

* Release artifacts to be voted upon (source and binary):
http://home.apache.org/~ewencp/kafka-1.0.1-rc0/

* Maven artifacts to be voted upon:
https://repository.apache.org/content/groups/staging/

* Javadoc:
http://home.apache.org/~ewencp/kafka-1.0.1-rc0/javadoc/

* Tag to be voted upon (off 1.0 branch) is the 1.0.1 tag:
https://github.com/apache/kafka/tree/1.0.1-rc0


* Documentation:
http://kafka.apache.org/10/documentation.html

* Protocol:
http://kafka.apache.org/10/protocol.html


Please test and verify the release artifacts and submit a vote for this RC,
or report any issues so we can fix them and get a new RC out ASAP! Although
this release vote requires PMC votes to pass, testing, votes, and bug
reports are valuable and appreciated from everyone.

Thanks,
Ewen


Re: [DISCUSS] KIP-174 - Deprecate and remove internal converter configs in WorkerConfig

2018-01-04 Thread Ewen Cheslack-Postava
Sorry I lost track of this thread. If things are in good shape we should
probably vote on this and get the deprecation commit through. It seems like
a good idea as this has been confusing to users from day one.

-Ewen
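For context, a sketch of the internal converter settings KIP-174 deprecates, reproducing the defaults that shipped in the example worker configs of that era (the temp file below is purely illustrative):

```shell
# Illustrative copy of the settings KIP-174 proposes to deprecate, matching
# the JsonConverter defaults from the shipped example worker configs.
cat > /tmp/internal-converter-settings.properties <<'EOF'
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
EOF
# Count the internal.* settings slated for deprecation.
grep -c '^internal\.' /tmp/internal-converter-settings.properties
```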

On Wed, Aug 9, 2017 at 5:18 AM, UMESH CHAUDHARY <umesh9...@gmail.com> wrote:

> Thanks Ewen,
> I just edited the KIP to reflect the changes.
>
> Regards,
> Umesh
>
> On Wed, 9 Aug 2017 at 11:00 Ewen Cheslack-Postava <e...@confluent.io>
> wrote:
>
>> Great, looking good. I'd probably be a bit more concrete about the
>> Proposed Changes (e.g., "will log a warning if the config is specified"
>> and "since the JsonConverter is the default, the configs will be removed
>> immediately from the example worker configuration files").
>>
>> Other than that this LGTM and I'll be happy to get rid of those settings!
>>
>> -Ewen
>>
>> On Tue, Aug 8, 2017 at 2:54 AM, UMESH CHAUDHARY <umesh9...@gmail.com>
>> wrote:
>>
>>> Hi Ewen,
>>> Sorry, I am a bit late in responding to this.
>>>
>>> Thanks for your inputs and I've updated the KIP by adding more details
>>> to it.
>>>
>>> Regards,
>>> Umesh
>>>
>>> On Mon, 31 Jul 2017 at 21:51 Ewen Cheslack-Postava <e...@confluent.io>
>>> wrote:
>>>
>>>> On Sun, Jul 30, 2017 at 10:21 PM, UMESH CHAUDHARY <umesh9...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Ewen,
>>>>> Thanks for your comments.
>>>>>
>>>>> 1) Yes, there are some test and Java classes which refer to these
>>>>> configs, so I will include them as well in the "public interface" section
>>>>> of the KIP. What should be our approach to deal with the classes and
>>>>> tests which use these configs: we need to change them to use
>>>>> JsonConverter when we plan for removal of these configs, right?
>>>>>
>>>>
>>>> I actually meant the references in config/connect-standalone.properties
>>>> and config/connect-distributed.properties
>>>>
>>>>
>>>>> 2) I believe we can target the deprecation in 1.0.0 release as it is
>>>>> planned in October 2017 and then removal in next major release. Let
>>>>> me know your thoughts as we don't have any information for next major
>>>>> release (next to 1.0.0) yet.
>>>>>
>>>>
>>>> That sounds fine. Tough to say at this point what our approach to major
>>>> version bumps will be since the approach to version numbering is changing a
>>>> bit.
>>>>
>>>>
>>>>> 3) That's a good point, and the mentioned JIRA can help us validate the
>>>>> usage of any other converters. I will list this in the KIP.
>>>>>
>>>>> Let me know if you have some additional thoughts on this.
>>>>>
>>>>> Regards,
>>>>> Umesh
>>>>>
>>>>>
>>>>>
>>>>> On Wed, 26 Jul 2017 at 09:27 Ewen Cheslack-Postava <e...@confluent.io>
>>>>> wrote:
>>>>>
>>>>>> Umesh,
>>>>>>
>>>>>> Thanks for the KIP. Straightforward and I think it's a good change.
>>>>>> Unfortunately it is hard to tell how many people it would affect
>>>>>> since we
>>>>>> can't tell how many people have adjusted that config, but I think
>>>>>> this is
>>>>>> the right thing to do long term.
>>>>>>
>>>>>> A couple of quick things that might be helpful to refine:
>>>>>>
>>>>>> * Note that there are also some references in the example configs
>>>>>> that we
>>>>>> should remove.
>>>>>> * It's nice to be explicit about when the removal is planned. This
>>>>>> lets us
>>>>>> set expectations with users for timeframe (especially now that we
>>>>>> have time
>>>>>> based releases), allows us to give info about the removal timeframe
>>>>>> in log
>>>>>> error messages, and lets us file a JIRA against that release so we
>>>>>> remember
>>>>>> to follow up. Given the update to 1.0.0 for the next release, we may
>>>>>> also
>>>>>> need to adjust how we deal with deprecations/removal if we don't want
>>>>>> to have to wait all the way until 2.0 to remove (though it is unclear
>>>>>> how exactly we will be handling version bumps from now on).

Re: [DISCUSS] KIP-174 - Deprecate and remove internal converter configs in WorkerConfig

2017-08-08 Thread Ewen Cheslack-Postava
Great, looking good. I'd probably be a bit more concrete about the Proposed
Changes (e.g., "will log a warning if the config is specified" and "since
the JsonConverter is the default, the configs will be removed immediately
from the example worker configuration files").

Other than that this LGTM and I'll be happy to get rid of those settings!

-Ewen

On Tue, Aug 8, 2017 at 2:54 AM, UMESH CHAUDHARY <umesh9...@gmail.com> wrote:

> Hi Ewen,
> Sorry, I am a bit late in responding to this.
>
> Thanks for your inputs and I've updated the KIP by adding more details to
> it.
>
> Regards,
> Umesh
>
> On Mon, 31 Jul 2017 at 21:51 Ewen Cheslack-Postava <e...@confluent.io>
> wrote:
>
>> On Sun, Jul 30, 2017 at 10:21 PM, UMESH CHAUDHARY <umesh9...@gmail.com>
>> wrote:
>>
>>> Hi Ewen,
>>> Thanks for your comments.
>>>
>>> 1) Yes, there are some test and Java classes which refer to these configs,
>>> so I will include them as well in the "public interface" section of the
>>> KIP. What should be our approach to deal with the classes and tests which
>>> use these configs: we need to change them to use JsonConverter when we plan
>>> for removal of these configs, right?
>>>
>>
>> I actually meant the references in config/connect-standalone.properties
>> and config/connect-distributed.properties
>>
>>
>>> 2) I believe we can target the deprecation in 1.0.0 release as it is
>>> planned in October 2017 and then removal in next major release. Let me
>>> know your thoughts as we don't have any information for next major release
>>> (next to 1.0.0) yet.
>>>
>>
>> That sounds fine. Tough to say at this point what our approach to major
>> version bumps will be since the approach to version numbering is changing a
>> bit.
>>
>>
>>> 3) That's a good point, and the mentioned JIRA can help us validate the
>>> usage of any other converters. I will list this in the KIP.
>>>
>>> Let me know if you have some additional thoughts on this.
>>>
>>> Regards,
>>> Umesh
>>>
>>>
>>>
>>> On Wed, 26 Jul 2017 at 09:27 Ewen Cheslack-Postava <e...@confluent.io>
>>> wrote:
>>>
>>>> Umesh,
>>>>
>>>> Thanks for the KIP. Straightforward and I think it's a good change.
>>>> Unfortunately it is hard to tell how many people it would affect since
>>>> we
>>>> can't tell how many people have adjusted that config, but I think this
>>>> is
>>>> the right thing to do long term.
>>>>
>>>> A couple of quick things that might be helpful to refine:
>>>>
>>>> * Note that there are also some references in the example configs that
>>>> we
>>>> should remove.
>>>> * It's nice to be explicit about when the removal is planned. This lets
>>>> us
>>>> set expectations with users for timeframe (especially now that we have
>>>> time
>>>> based releases), allows us to give info about the removal timeframe in
>>>> log
>>>> error messages, and lets us file a JIRA against that release so we
>>>> remember
>>>> to follow up. Given the update to 1.0.0 for the next release, we may
>>>> also
>>>> need to adjust how we deal with deprecations/removal if we don't want to
>>>> have to wait all the way until 2.0 to remove (though it is unclear how
>>>> exactly we will be handling version bumps from now on).
>>>> * Migration path -- I think this is the major missing gap in the KIP.
>>>> Do we
>>>> need a migration path? If not, presumably it is because people aren't
>>>> using
>>>> any other converters in practice. Do we have some way of validating
>>>> this (
>>>> https://issues.apache.org/jira/browse/KAFKA-3988 might be pretty
>>>> convincing
>>>> evidence)? If there are some users using other converters, how would
>>>> they
>>>> migrate to newer versions which would no longer support that?
>>>>
>>>> -Ewen
>>>>
>>>>
>>>> On Fri, Jul 14, 2017 at 2:37 AM, UMESH CHAUDHARY <umesh9...@gmail.com>
>>>> wrote:
>>>>
>>>> > Hi there,
>>>> > Resending as probably missed earlier to grab your attention.
>>>> >
>>>> > Regards,
>>>> > Umesh
>>>> >
>>>> > -- Forwarded message -
>>>> > From: UMESH CHAUDHARY <umesh9...@gmail.com>
>>>> > Date: Mon, 3 Jul 2017 at 11:04
>>>> > Subject: [DISCUSS] KIP-174 - Deprecate and remove internal converter
>>>> > configs in WorkerConfig
>>>> > To: d...@kafka.apache.org <d...@kafka.apache.org>
>>>> >
>>>> >
>>>> > Hello All,
>>>> > I have added a KIP recently to deprecate and remove internal converter
>>>> > configs in WorkerConfig.java class because these have ultimately just
>>>> > caused a lot more trouble and confusion than it is worth.
>>>> >
>>>> > Please find the KIP here
>>>> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>>> > 174+-+Deprecate+and+remove+internal+converter+configs+in+
>>>> WorkerConfig>
>>>> > and
>>>> > the related JIRA here <https://issues.apache.org/
>>>> jira/browse/KAFKA-5540>.
>>>> >
>>>> > Appreciate your review and comments.
>>>> >
>>>> > Regards,
>>>> > Umesh
>>>> >
>>>>
>>>
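The deprecation step discussed in this thread — warn when the soon-to-be-removed internal converter configs are still set, while continuing to honor them — follows a common pattern. The real change lives in WorkerConfig's Java config handling; the Python sketch below is only an illustration of that pattern (the config names match the KIP, everything else is hypothetical):

```python
import logging

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger("worker-config")

# Configs slated for deprecation/removal per KIP-174.
DEPRECATED_CONFIGS = {"internal.key.converter", "internal.value.converter"}

def check_deprecated(worker_props):
    """Warn about (but still accept) configs slated for removal."""
    found = DEPRECATED_CONFIGS & set(worker_props)
    for name in sorted(found):
        log.warning(
            "Config '%s' is deprecated and will be removed in a future "
            "major release; JsonConverter is already the default.", name)
    return found

# A worker config that still sets one of the internal converters:
props = {
    "internal.key.converter": "org.apache.kafka.connect.json.JsonConverter",
    "bootstrap.servers": "localhost:9092",
}
print(sorted(check_deprecated(props)))  # ['internal.key.converter']
```

A config with neither key set passes through silently, which is the desired end state once users drop the settings.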


Re: struggling with runtime Schema in connect

2017-07-31 Thread Ewen Cheslack-Postava
It actually is possible to do so if you adapt the Connect Converter API to
streams. There are a couple of good reasons why we shouldn't require
everyone to just use the same schema:

1. Efficiency

Connect favors a little bit of inefficiency (translating byte[] ->
serialization runtime format -> Connect runtime format, or vice versa) to
get more flexibility for serialization formats and reuse of code. This is a
good choice for Connect, but for Streams apps may or may not be worth the
overhead of doing an additional transformation.

2. API familiarity

Most users of Connect wouldn't even have to be aware of the Connect data
API. However, they are likely already familiar with the APIs used for their
serialization format (whether that's POJOs for Jackson, convenient
SpecificRecords in Avro, or something else). Streams has the flexibility to
deal with whatever serialization format you want (just as Connect can,
simply in different ways). This lets people work with (and specifically
*code against*) a format and APIs they are already comfortable with.

3. Impedance mismatch

One drawback it is important to think about when considering an API that
can safely be converted into a bunch of different formats is that there can
be some data or precision loss. Maybe Connect doesn't have a way to express
UUIDs (something that came up recently in JIRAs) or maybe it doesn't have a
type that corresponds to a type in your serialization library. In that
case, not passing through a generic translation layer is a *good* thing,
since it preserves as much information as possible.

If you like the generic Connect data API and it satisfies your needs, then
I'd highly encourage you to write a simple Kafka Streams serde that uses
Converters! For all intents and purposes, it is no different than any other
serialization library/format!
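The serde Ewen suggests is essentially an adapter: a real implementation would implement `org.apache.kafka.common.serialization.Serde` in Java and delegate to a Connect `Converter`. The toy Python sketch below (all class and method names hypothetical) only illustrates the shape of that adapter, using JSON as the stand-in wire format:

```python
import json

class JsonConverterStandIn:
    """Stand-in for a Connect-style Converter: runtime data <-> byte[]."""
    def from_connect_data(self, topic, value):
        return json.dumps(value).encode("utf-8")
    def to_connect_data(self, topic, data):
        return json.loads(data.decode("utf-8"))

class ConverterSerde:
    """Adapter exposing a Converter through a Streams-style serde interface."""
    def __init__(self, converter):
        self._converter = converter
    def serialize(self, topic, value):
        return self._converter.from_connect_data(topic, value)
    def deserialize(self, topic, data):
        return self._converter.to_connect_data(topic, data)

serde = ConverterSerde(JsonConverterStandIn())
payload = serde.serialize("orders", {"id": 1, "qty": 3})
assert serde.deserialize("orders", payload) == {"id": 1, "qty": 3}
```

The cost noted in point 1 is visible here: every record makes the extra hop through the converter's intermediate representation.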

-Ewen

On Wed, Jul 26, 2017 at 6:11 AM, Koert Kuipers <ko...@tresata.com> wrote:

> just out of curiosity, why does kafka streams not use this runtime data api
> defined in kafka connect?
>
> On Wed, Jul 26, 2017 at 3:10 AM, Ewen Cheslack-Postava <e...@confluent.io>
> wrote:
>
> > Stephen's explanation is great and accurate :)
> >
> > One of the design goals for Kafka Connect was to not rely on any specific
> > serialization format since that is really orthogonal to getting/sending
> > data from/to other systems. We define the generic *runtime* data API,
> which
> > is what you'll find in the Kafka Connect Java data API. It is intentional
> > that connectors/tasks only interact with these Java objects and the steps
> > to convert this to/from the byte[] stored in Kafka is handled
> independently
> > by a plugin that the *user* can choose to get different serialization
> > formats so they can use whatever serialization format they like with any
> > connector.
> >
> > Koert is correct that Kafka itself sticks to opaque byte[] data and has
> no
> > understanding of the structure or data itself. Connect and Streams are
> both
> > meant to build on top of this raw, low-level functionality and handle
> some
> > higher-level functionality.
> >
> > -Ewen
> >
> > On Mon, Jul 10, 2017 at 8:18 AM, Stephen Durfey <sjdur...@gmail.com>
> > wrote:
> >
> > > Ah, sorry, I have never used the JsonConverter, so didn't know that was
> > > actually a thing. Looking at the code it looks like the converter can
> > > handle json with or without the schema [1]. Take a look at the json
> > > envelope code to get an idea of how the schema is passed along with the
> > > message (also in the json converter code linked below). Setting those
> > > configs will enable the schema to travel along with the data. Just make
> > sure
> > > those configs are set on both workers, if your sink and source tasks
> are
> > in
> > > different jvms.
> > >
> > > [1]
> > > https://github.com/apache/kafka/blob/trunk/connect/json/
> > > src/main/java/org/apache/kafka/connect/json/
> JsonConverter.java#L299-L321
> > >
> > > On Mon, Jul 10, 2017 at 9:06 AM, Koert Kuipers <ko...@tresata.com>
> > wrote:
> > >
> > > > thanks for that explanation.
> > > >
> > > > If I use JSON instead of Avro, should I use the JSON serialization
> > > > that serializes both schema and data, so that the schema travels with
> > > > the data from source to sink? That is, set
> > > > key.converter.schemas.enable=true and
> > > > value.converter.schemas.enable=true?
> > > >
> > > > Is it a correct assumption that Kafka Connect wouldn't work if I chose
> > > > the "raw" JSON serialization that discards the schema?
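With `schemas.enable=true`, the `JsonConverter` discussed in this thread wraps every message in a schema/payload envelope so the schema travels with the data. The sketch below builds such an envelope by hand to show the wire shape (the struct fields and the `example.User` schema name are made up; the real converter produces this from a Connect `Schema`):

```python
import json

# Envelope shape used by org.apache.kafka.connect.json.JsonConverter when
# schemas.enable=true: the schema rides alongside every payload.
envelope = {
    "schema": {
        "type": "struct",
        "fields": [
            {"field": "id", "type": "int64", "optional": False},
            {"field": "name", "type": "string", "optional": True},
        ],
        "optional": False,
        "name": "example.User",  # hypothetical schema name
    },
    "payload": {"id": 42, "name": "alice"},
}

wire_bytes = json.dumps(envelope).encode("utf-8")

# A sink worker with schemas.enable=true splits the envelope back apart:
decoded = json.loads(wire_bytes)
print(decoded["payload"]["id"])  # 42
```

With `schemas.enable=false` only the bare payload is written, which is why a schema-dependent sink can't reconstruct type information from "raw" JSON.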

Re: [DISCUSS] KIP-174 - Deprecate and remove internal converter configs in WorkerConfig

2017-07-31 Thread Ewen Cheslack-Postava
On Sun, Jul 30, 2017 at 10:21 PM, UMESH CHAUDHARY <umesh9...@gmail.com>
wrote:

> Hi Ewen,
> Thanks for your comments.
>
> 1) Yes, there are some test and Java classes which refer to these configs,
> so I will include them as well in the "public interface" section of the KIP.
> What should be our approach to deal with the classes and tests which use
> these configs: we need to change them to use JsonConverter when we plan for
> removal of these configs, right?
>

I actually meant the references in config/connect-standalone.properties and
config/connect-distributed.properties


> 2) I believe we can target the deprecation in 1.0.0 release as it is
> planned in October 2017 and then removal in next major release. Let me
> know your thoughts as we don't have any information for next major release
> (next to 1.0.0) yet.
>

That sounds fine. Tough to say at this point what our approach to major
version bumps will be since the approach to version numbering is changing a
bit.


> 3) That's a good point, and the mentioned JIRA can help us validate the usage
> of any other converters. I will list this in the KIP.
>
> Let me know if you have some additional thoughts on this.
>
> Regards,
> Umesh
>
>
>
> On Wed, 26 Jul 2017 at 09:27 Ewen Cheslack-Postava <e...@confluent.io>
> wrote:
>
>> Umesh,
>>
>> Thanks for the KIP. Straightforward and I think it's a good change.
>> Unfortunately it is hard to tell how many people it would affect since we
>> can't tell how many people have adjusted that config, but I think this is
>> the right thing to do long term.
>>
>> A couple of quick things that might be helpful to refine:
>>
>> * Note that there are also some references in the example configs that we
>> should remove.
>> * It's nice to be explicit about when the removal is planned. This lets us
>> set expectations with users for timeframe (especially now that we have
>> time
>> based releases), allows us to give info about the removal timeframe in log
>> error messages, and lets us file a JIRA against that release so we
>> remember
>> to follow up. Given the update to 1.0.0 for the next release, we may also
>> need to adjust how we deal with deprecations/removal if we don't want to
>> have to wait all the way until 2.0 to remove (though it is unclear how
>> exactly we will be handling version bumps from now on).
>> * Migration path -- I think this is the major missing gap in the KIP. Do
>> we
>> need a migration path? If not, presumably it is because people aren't
>> using
>> any other converters in practice. Do we have some way of validating this (
>> https://issues.apache.org/jira/browse/KAFKA-3988 might be pretty
>> convincing
>> evidence)? If there are some users using other converters, how would they
>> migrate to newer versions which would no longer support that?
>>
>> -Ewen
>>
>>
>> On Fri, Jul 14, 2017 at 2:37 AM, UMESH CHAUDHARY <umesh9...@gmail.com>
>> wrote:
>>
>> > Hi there,
>> > Resending as probably missed earlier to grab your attention.
>> >
>> > Regards,
>> > Umesh
>> >
>> > -- Forwarded message -
>> > From: UMESH CHAUDHARY <umesh9...@gmail.com>
>> > Date: Mon, 3 Jul 2017 at 11:04
>> > Subject: [DISCUSS] KIP-174 - Deprecate and remove internal converter
>> > configs in WorkerConfig
>> > To: d...@kafka.apache.org <d...@kafka.apache.org>
>> >
>> >
>> > Hello All,
>> > I have added a KIP recently to deprecate and remove internal converter
>> > configs in the WorkerConfig.java class because these have ultimately just
>> > caused a lot more trouble and confusion than they are worth.
>> >
>> > Please find the KIP here
>> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> > 174+-+Deprecate+and+remove+internal+converter+configs+in+WorkerConfig>
>> > and
>> > the related JIRA here <https://issues.apache.org/jira/browse/KAFKA-5540
>> >.
>> >
>> > Appreciate your review and comments.
>> >
>> > Regards,
>> > Umesh
>> >
>>
>


Re: Kafka Connect distributed mode rebalance

2017-07-26 Thread Ewen Cheslack-Postava
Btw, if you can share, I would be curious what connectors you're using and
why you need so many. I'd be interested if a modification to the connector
could also simplify things for you.

-Ewen

On Wed, Jul 26, 2017 at 12:33 AM, Ewen Cheslack-Postava <e...@confluent.io>
wrote:

> Stephen,
>
> Cool, that is a *lot* of connectors!
>
> Regarding rebalances, the reason this happens is that Kafka Connect is
> trying to keep the total work of the cluster balanced across the workers.
> If you add/remove connectors or the # of workers change, then we need to go
> through another round deciding where that work is done. The way this is
> accomplished is by having the workers coordinate through Kafka's group
> coordination protocol by performing a rebalance. This is very similar to
> how consumer rebalances work -- the members all "rejoin" the group, one
> figures out how to assign work, and then everyone gets their assignments
> and restarts work.
>
> The way this works today is global -- everyone has to stop work, commit
> offsets, then start the process where work is assigned, and finally restart
> work. That's why you're seeing everything stop, then restart.
>
> We know this will eventually become a scalability limit. We've talked
> about other approaches that avoid requiring stopping everything. There's
> not currently a JIRA with more details & ideas, but
> https://issues.apache.org/jira/browse/KAFKA-5505 is filed for the general
> issue. We haven't committed to any specific approach, but I've thought
> through this a bit and have some ideas around how we could make the process
> more incremental such that we don't have to stop *everything* during a
> single rebalance process, instead accepting the cost of some subsequent
> rebalances in order to make each iteration faster/cheaper.
>
> I'm not sure when we'll get these updates in yet. One other thing to
> consider is if it is possible to use fewer connectors at a time. One of our
> goals was to encourage broad copying by default; fewer connectors/tasks
> doesn't necessarily solve your problem, but depending on the connectors
> you're using it is possible it would reduce the time spent
> stopping/starting tasks during the rebalance and alleviate your problem.
>
> -Ewen
>
> On Thu, Jul 20, 2017 at 8:01 AM, Stephen Durfey <sjdur...@gmail.com>
> wrote:
>
>> I'm seeing some behavior with the DistributedHerder that I am trying to
>> understand. I'm working on setting up a cluster of kafka connect nodes and
>> have a relatively large number of connectors to submit to it (392
>> connectors right now that will soon become over 1100). As for the
>> deployment of it I am using chef, and having that PUT connector configs at
>> deployment time so I can create/update any connectors.
>>
>> Every time I PUT a new connector config to the worker it appears to be
>> initiating an assignment rebalance. I believe this is only happening when
>> submitting a new connector. This is causing all existing and running
>> connectors to stop and restart. My logs end up being flooded with
>> exceptions from the source jdbc task with sql connections being closed and
>> wakeup exceptions in my sink tasks when committing offsets. This causes
>> issues beyond having to wait for a rebalance as restarting the jdbc
>> connectors causes them to re-pull all data, since they are using bulk
>> mode.
>> Everything eventually settles down and all the connectors finish
>> successfully, but each PUT takes progressively longer waiting for a
>> rebalance to finish.
>>
>> If I simply restart the worker nodes and let them only instantiate
>> connectors that have already been successfully submitted everything starts
>> up fine. So, this is only an issue when submitting new connectors over the
>> REST endpoint.
>>
>> So, I'm trying to understand why submitting a new connector causes the
>> rebalancing, but also if there is a better way to deploy the connector
>> configs in distributed mode?
>>
>> Thanks,
>>
>> Stephen
>>
>
>
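One deployment detail relevant to Stephen's setup: `PUT /connectors/{name}/config` is idempotent — it creates the connector if absent and updates it otherwise — so tooling can push the same config file on every run. A hypothetical JDBC source config as sent in the PUT body (field values are illustrative; `mode=bulk` matches the behavior described above):

```
{
  "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
  "tasks.max": "1",
  "connection.url": "jdbc:postgresql://db.example.com/orders",
  "mode": "bulk",
  "topic.prefix": "orders-"
}
```

For example, assuming a worker on the default REST port 8083: `curl -X PUT -H "Content-Type: application/json" --data @orders.json http://worker:8083/connectors/orders-source/config`. Note that each such PUT that changes the set of connectors still triggers the rebalance described in this thread.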


Re: Kafka Connect distributed mode rebalance

2017-07-26 Thread Ewen Cheslack-Postava
Stephen,

Cool, that is a *lot* of connectors!

Regarding rebalances, the reason this happens is that Kafka Connect is
trying to keep the total work of the cluster balanced across the workers.
If you add/remove connectors or the # of workers change, then we need to go
through another round deciding where that work is done. The way this is
accomplished is by having the workers coordinate through Kafka's group
coordination protocol by performing a rebalance. This is very similar to
how consumer rebalances work -- the members all "rejoin" the group, one
figures out how to assign work, and then everyone gets their assignments
and restarts work.

The way this works today is global -- everyone has to stop work, commit
offsets, then start the process where work is assigned, and finally restart
work. That's why you're seeing everything stop, then restart.

We know this will eventually become a scalability limit. We've talked about
other approaches that avoid requiring stopping everything. There's not
currently a JIRA with more details & ideas, but
https://issues.apache.org/jira/browse/KAFKA-5505 is filed for the general
issue. We haven't committed to any specific approach, but I've thought
through this a bit and have some ideas around how we could make the process
more incremental such that we don't have to stop *everything* during a
single rebalance process, instead accepting the cost of some subsequent
rebalances in order to make each iteration faster/cheaper.

I'm not sure when we'll get these updates in yet. One other thing to
consider is if it is possible to use fewer connectors at a time. One of our
goals was to encourage broad copying by default; fewer connectors/tasks
doesn't necessarily solve your problem, but depending on the connectors
you're using it is possible it would reduce the time spent
stopping/starting tasks during the rebalance and alleviate your problem.

-Ewen

On Thu, Jul 20, 2017 at 8:01 AM, Stephen Durfey  wrote:

> I'm seeing some behavior with the DistributedHerder that I am trying to
> understand. I'm working on setting up a cluster of kafka connect nodes and
> have a relatively large number of connectors to submit to it (392
> connectors right now that will soon become over 1100). As for the
> deployment of it I am using chef, and having that PUT connector configs at
> deployment time so I can create/update any connectors.
>
> Every time I PUT a new connector config to the worker it appears to be
> initiating an assignment rebalance. I believe this is only happening when
> submitting a new connector. This is causing all existing and running
> connectors to stop and restart. My logs end up being flooded with
> exceptions from the source jdbc task with sql connections being closed and
> wakeup exceptions in my sink tasks when committing offsets. This causes
> issues beyond having to wait for a rebalance as restarting the jdbc
> connectors causes them to re-pull all data, since they are using bulk mode.
> Everything eventually settles down and all the connectors finish
> successfully, but each PUT takes progressively longer waiting for a
> rebalance to finish.
>
> If I simply restart the worker nodes and let them only instantiate
> connectors that have already been successfully submitted everything starts
> up fine. So, this is only an issue when submitting new connectors over the
> REST endpoint.
>
> So, I'm trying to understand why submitting a new connector causes the
> rebalancing, but also if there is a better way to deploy the connector
> configs in distributed mode?
>
> Thanks,
>
> Stephen
>
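The eager "stop everything, reassign, restart" protocol described above can be caricatured in a few lines. This toy model (not Connect's actual assignment algorithm) shows why adding a single connector pauses all work across the cluster:

```python
def eager_rebalance(workers, connectors):
    """Toy model of an eager rebalance: all work stops, then everything is
    reassigned round-robin across the current set of workers."""
    # Step 1: every worker revokes all of its running work.
    assignments = {w: [] for w in workers}
    # Step 2: the leader assigns all connectors round-robin.
    for i, connector in enumerate(sorted(connectors)):
        assignments[workers[i % len(workers)]].append(connector)
    # Step 3: each worker (re)starts whatever it was assigned -- including
    # connectors that never moved, which is the costly part at 392+ connectors.
    return assignments

before = eager_rebalance(["w1", "w2"], ["jdbc-1", "jdbc-2", "s3-1"])
# Adding one connector forces a full stop/start cycle for all four:
after = eager_rebalance(["w1", "w2"], ["jdbc-1", "jdbc-2", "s3-1", "jdbc-3"])
print(after)  # {'w1': ['jdbc-1', 'jdbc-3'], 'w2': ['jdbc-2', 's3-1']}
```

An incremental scheme of the kind Ewen mentions would instead only stop and move the minimal set of connectors whose assignment actually changed.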


Re: struggling with runtime Schema in connect

2017-07-26 Thread Ewen Cheslack-Postava
Stephen's explanation is great and accurate :)

One of the design goals for Kafka Connect was to not rely on any specific
serialization format since that is really orthogonal to getting/sending
data from/to other systems. We define the generic *runtime* data API, which
is what you'll find in the Kafka Connect Java data API. It is intentional
that connectors/tasks only interact with these Java objects and the steps
to convert this to/from the byte[] stored in Kafka is handled independently
by a plugin that the *user* can choose to get different serialization
formats so they can use whatever serialization format they like with any
connector.

Koert is correct that Kafka itself sticks to opaque byte[] data and has no
understanding of the structure or data itself. Connect and Streams are both
meant to build on top of this raw, low-level functionality and handle some
higher-level functionality.

-Ewen
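Concretely, the converter plugin choice Ewen describes is made in the worker configuration, not in the connector. A fragment using the stock JSON converter (any `Converter` implementation on the classpath can be substituted):

```
# Worker-level converter settings, e.g. in connect-distributed.properties
# or connect-standalone.properties:
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# Whether the JSON envelope carries the schema alongside each payload:
key.converter.schemas.enable=true
value.converter.schemas.enable=true
```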

On Mon, Jul 10, 2017 at 8:18 AM, Stephen Durfey  wrote:

> Ah, sorry, I have never used the JsonConverter, so didn't know that was
> actually a thing. Looking at the code it looks like the converter can
> handle json with or without the schema [1]. Take a look at the json
> envelope code to get an idea of how the schema is passed along with the
> message (also in the json converter code linked below). Setting those
> configs will enable the schema to travel along with the data. Just make sure
> those configs are set on both workers, if your sink and source tasks are in
> different jvms.
>
> [1]
> https://github.com/apache/kafka/blob/trunk/connect/json/
> src/main/java/org/apache/kafka/connect/json/JsonConverter.java#L299-L321
>
> On Mon, Jul 10, 2017 at 9:06 AM, Koert Kuipers  wrote:
>
> > thanks for that explanation.
> >
> > If I use JSON instead of Avro, should I use the JSON serialization that
> > serializes both schema and data, so that the schema travels with the data
> > from source to sink? That is, set key.converter.schemas.enable=true and
> > value.converter.schemas.enable=true?
> >
> > Is it a correct assumption that Kafka Connect wouldn't work if I chose the
> > "raw" JSON serialization that discards the schema?
> >
> > On Sun, Jul 9, 2017 at 1:10 PM, Stephen Durfey 
> wrote:
> >
> > > I'll try to answer this for you. I'm going to assume you are using the
> > > pre-packaged kafka connect distro from confluent.
> > >
> > > org.apache.kafka.connect.data.Schema is an abstraction of the type
> > > definition for the data being passed around. How that is defined
> > > generally falls onto the connector being used. The source connector can
> > > provide the schema definition information and make it available for the
> > > sink connector to infer from provided information by the source
> > connector.
> > > How that is done is up to the connector developer (since, as you
> mention
> > > kafka only cares about bytes). I'll use a specific example to highlight
> > > some of the pieces that play into it.
> > >
> > > For instance, the confluent JDBC source connector uses table
> information
> > > and dynamically generates the o.a.k.c.d.Schema from that. That
> definition
> > > becomes part of the SourceRecord. When the worker goes to serialize
> that
> > > payload to send to kafka, it uses a converter class [1]. The specific
> > class
> > > is defined by 'key.converter' and 'value.converter' for the worker
> > > definition. The worker calls those specific classes when it needs to
> > > serialize [2]. This is where the developer can insert logic to inform
> > > downstream consumers of the schema of the data written to kafka. In the
> > > pre-packaged distro, it uses the AvroConverter class (also provided by
> > > confluent) [3]. This class uses custom serializers and deserializers
> [4]
> > to
> > > interact with the schema registry. The schema is turned into an Avro
> > Schema
> > > and registered with the schema registry. The schema registry in
> > > turn returns an id to use to retrieve the schema at a later time. The
> id
> > > is serialized in the front of the bytes being written to kafka.
> > > Downstream
> > > users can use the custom deserializer to get back to the original
> message
> > > generated by the source connector.
> > >
> > > I hope this helps.
> > >
> > >
> > > [1]
> > > https://github.com/apache/kafka/blob/trunk/connect/api/
> > > src/main/java/org/apache/kafka/connect/storage/Converter.java
> > >
> > > [2]
> > > https://github.com/apache/kafka/blob/41e676d29587042994a72baa5000a8
> > > 861a075c8c/connect/runtime/src/main/java/org/apache/
> > kafka/connect/runtime/
> > > WorkerSourceTask.java#L182-L183
> > >
> > > [3]
> > > https://github.com/confluentinc/schema-registry/
> > > blob/master/avro-converter/src/main/java/io/confluent/
> > > connect/avro/AvroConverter.java
> > >
> > > [4]
> > > https://github.com/confluentinc/schema-registry/
> > > tree/master/avro-serializer/src/main/java/io/confluent/
> kafka/serializers
> > >
> > > On Sat, Jul 8, 2017 

Re: Kafka Connect Embedded API

2017-07-26 Thread Ewen Cheslack-Postava
The vast majority of KIP-26 has been implemented. Unfortunately, the
embedded API is still one of the gaps that has not yet been implemented. It
likely requires some additional design work as only a prototype API was
proposed in the KIP describing the framework as a whole.

-Ewen

On Wed, Jul 12, 2017 at 10:45 AM, Debasish Ghosh 
wrote:

> Thanks .. I found this blog post from Confluent
> https://www.confluent.io/blog/hello-world-kafka-connect-kafka-streams/ ..
> It's possibly pre-KIP 26, as the code base says ..
>
> /**
>  * This is only a temporary extension to Kafka Connect runtime until there
> is an Embedded API as per KIP-26
>  */
>
> Has KIP-26 been implemented as of 0.11 ? And do users use embedded Kafka
> Connect ?
>
> regards.
>
> On Wed, Jul 12, 2017 at 8:51 PM, David Garcia 
> wrote:
>
> > I would just look at an example:
> >  https://github.com/confluentinc/kafka-connect-jdbc
> > https://github.com/confluentinc/kafka-connect-hdfs
> >
> >
> >
> >
> >
> > On 7/12/17, 8:27 AM, "Debasish Ghosh"  wrote:
> >
> > Hi -
> >
> > I would like to use the embedded API of Kafka Connect as per
> > https://cwiki.apache.org/confluence/pages/viewpage.
> > action?pageId=58851767.
> > But cannot find enough details regarding the APIs and implementation
> > models. Is there any sample example that gives enough details about
> > embedded Kafka Connect APIs ?
> >
> > My use case is as follows :-
> >
> > I have a Kafka Streams app from which I plan to use a HDFS sink
> > connector
> > to write to HDFS. And plan to use embedded Kafka Connect API.
> >
> > regards.
> >
> > --
> > Debasish Ghosh
> > http://manning.com/ghosh2
> > http://manning.com/ghosh
> >
> > Twttr: @debasishg
> > Blog: http://debasishg.blogspot.com
> > Code: http://github.com/debasishg
> >
> >
> >
>
>
> --
> Debasish Ghosh
> http://manning.com/ghosh2
> http://manning.com/ghosh
>
> Twttr: @debasishg
> Blog: http://debasishg.blogspot.com
> Code: http://github.com/debasishg
>


Re: Kafka connector throughput reduction upon avro schema change

2017-07-26 Thread Ewen Cheslack-Postava
What is your setting for schema.compatibility? I suspect the issue is
that it is defaulting to NONE, which would cause the connector to
roll a new file when the schema changes (which will be frequent with data
that is interleaved with different schemas).

If you set it to BACKWARD then the records would be properly projected and
not require rolling files. Of course this assumes you are ok with records
being projected to the latest schema.

-Ewen
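For reference, a sketch of where this setting lands in an S3 sink connector submission. The property names follow the Confluent storage connectors; the connector name, topic, and flush size below are placeholders:

```json
{
  "name": "s3-avro-sink",
  "config": {
    "connector.class": "io.confluent.connect.s3.S3SinkConnector",
    "topics": "avro-topic",
    "flush.size": "1000",
    "schema.compatibility": "BACKWARD"
  }
}
```

With BACKWARD, records carrying an older schema are projected to the latest schema seen, so the connector does not have to close and ship a file on every schema switch during a rolling producer deploy.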

On Thu, Jul 6, 2017 at 10:04 AM, Dave Hamilton 
wrote:

> Bumping this. Has anyone here observed this in their Kafka connect
> deployments?
>
> Thanks,
> Dave
>
>
> On 5/26/17, 1:44 PM, "Dave Hamilton"  wrote:
>
> We are currently using the Kafka S3 connector to ship Avro data to S3.
> We made a change to one of our Avro schemas and have noticed consumer
> throughput on the Kafka connector drop considerably. I am wondering if
> there is anything we can do to avoid such issues when we update schemas in
> the future?
>
> This is what I believe is happening:
>
>
> · The avro producer application is running on 12 instances.
> They are restarted in a rolling fashion, switching from producing schema
> version 1 before the restart to schema version 2 afterward.
>
> · While the rolling restart is occurring, data on schema
> version 1 and schema version 2 is simultaneously being written to the topic.
>
> · The Kafka connector has to close the current avro file for a
> partition and ship it whenever it detects a schema change, which is
> happening several times due to the rolling nature of the schema update
> deployment and the mixture of message versions being written during this
> time. This process causes the overall consumer throughput to plummet.
>
> Am I reasoning correctly about what we’re observing here? Is there any
> way to avoid this when we change schemas (short of stopping all instances
> of the service and bringing them up together on the new schema version)?
>
> Thanks,
> Dave
>
>
>
>


Re: [DISCUSS] KIP-174 - Deprecate and remove internal converter configs in WorkerConfig

2017-07-25 Thread Ewen Cheslack-Postava
Umesh,

Thanks for the KIP. Straightforward and I think it's a good change.
Unfortunately it is hard to tell how many people it would affect since we
can't tell how many people have adjusted that config, but I think this is
the right thing to do long term.

A couple of quick things that might be helpful to refine:

* Note that there are also some references in the example configs that we
should remove.
* It's nice to be explicit about when the removal is planned. This lets us
set expectations with users for timeframe (especially now that we have time
based releases), allows us to give info about the removal timeframe in log
error messages, and lets us file a JIRA against that release so we remember
to follow up. Given the update to 1.0.0 for the next release, we may also
need to adjust how we deal with deprecations/removal if we don't want to
have to wait all the way until 2.0 to remove (though it is unclear how
exactly we will be handling version bumps from now on).
* Migration path -- I think this is the major missing gap in the KIP. Do we
need a migration path? If not, presumably it is because people aren't using
any other converters in practice. Do we have some way of validating this (
https://issues.apache.org/jira/browse/KAFKA-3988 might be pretty convincing
evidence)? If there are some users using other converters, how would they
migrate to newer versions which would no longer support that?

-Ewen


On Fri, Jul 14, 2017 at 2:37 AM, UMESH CHAUDHARY 
wrote:

> Hi there,
> Resending, as this probably got missed earlier.
>
> Regards,
> Umesh
>
> -- Forwarded message -
> From: UMESH CHAUDHARY 
> Date: Mon, 3 Jul 2017 at 11:04
> Subject: [DISCUSS] KIP-174 - Deprecate and remove internal converter
> configs in WorkerConfig
> To: d...@kafka.apache.org 
>
>
> Hello All,
> I have added a KIP recently to deprecate and remove the internal converter
> configs in the WorkerConfig.java class, because these have ultimately
> caused a lot more trouble and confusion than they are worth.
>
> Please find the KIP here
>  174+-+Deprecate+and+remove+internal+converter+configs+in+WorkerConfig>
> and
> the related JIRA here .
>
> Appreciate your review and comments.
>
> Regards,
> Umesh
>


Re: [DISCUSS] KIP-163: Lower the Minimum Required ACL Permission of OffsetFetch

2017-07-24 Thread Ewen Cheslack-Postava
Vahid,

Thanks for the KIP. I think we're mostly in violent agreement that the lack
of any Write permissions on consumer groups is confusing. Unfortunately
it's a pretty annoying issue to fix since it would require an increase in
permissions. More generally, I think it's unfortunate because by squeezing
all permissions into the lowest two levels, we have no room for refinement,
e.g. if we realize some permission needs to have a lower level of access
but higher than Describe, without adding new levels.

I'm +1 on the KIP. I don't think it's ideal given the discussion of Read vs
Write since I think Read is the correct permission in theory, but given
where we are now it makes sense.

Regarding the extra food for thought, I think such a change would require
some plan for how to migrate people over to it. The main proposal in the
KIP works without any migration plan because it is reducing the required
permissions, but changing the requirement for listing a group to Describe
(Group) would be adding/changing the requirements, which would be backwards
incompatible. I'd be open to doing it, but it'd require some thought about
how it would impact users and how we'd migrate them to the updated rule (or
just agree that it is a bug and that including upgrade notes would be
sufficient).

-Ewen

On Mon, Jul 10, 2017 at 1:12 PM, Vahid S Hashemian <
vahidhashem...@us.ibm.com> wrote:

> I'm bumping this up again to get some feedback, especially from some of
> the committers, on the KIP and on the note below.
>
> Thanks.
> --Vahid
>
>
>
>
> From:   "Vahid S Hashemian" 
> To: d...@kafka.apache.org
> Cc: "Kafka User" 
> Date:   06/21/2017 12:49 PM
> Subject:Re: [DISCUSS] KIP-163: Lower the Minimum Required ACL
> Permission of OffsetFetch
>
>
>
> I appreciate everyone's feedback so far on this KIP.
>
> Before starting a vote, I'd like to also ask for feedback on the
> "Additional Food for Thought" section in the KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 163%3A+Lower+the+Minimum+Required+ACL+Permission+of+OffsetFetch#KIP-163:
> LowertheMinimumRequiredACLPermissionofOffsetFetch-AdditionalFoodforThought
>
> I just added some more details in that section, which I hope further
> clarifies the suggestion there.
>
> Thanks.
> --Vahid
>
>
>
>
>
>
>
>
>
>
>


Re: Kafka broker startup issue

2017-05-23 Thread Ewen Cheslack-Postava
Version 2 of UpdateMetadataRequest does not exist in version 0.9.0.1. This
suggests that you have a broker with a newer version of Kafka running
against the same ZooKeeper cluster. Do you have any other versions running?
Or is it possible this is a shared ZK cluster and you're not using a
namespace within ZK for each Kafka cluster?

-Ewen

On Mon, May 22, 2017 at 12:33 AM, dhiraj prajapati 
wrote:

> Hi,
> I am getting the below exception while starting kafka broker 0.9.0.1:
>
> kafka.common.KafkaException: Version 2 is invalid for
> UpdateMetadataRequest. Valid versions are 0 or 1.
> at
> kafka.api.UpdateMetadataRequest$.readFrom(UpdateMetadataRequest.scala:58)
> at kafka.api.RequestKeys$$anonfun$7.apply(RequestKeys.scala:54)
> at kafka.api.RequestKeys$$anonfun$7.apply(RequestKeys.scala:54)
> at kafka.network.RequestChannel$Request.(RequestChannel.
> scala:66)
> at kafka.network.Processor$$anonfun$run$11.apply(
> SocketServer.scala:426)
> at kafka.network.Processor$$anonfun$run$11.apply(
> SocketServer.scala:421)
> at scala.collection.Iterator$class.foreach(Iterator.scala:742)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
> at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> at kafka.network.Processor.run(SocketServer.scala:421)
> at java.lang.Thread.run(Thread.java:745)
>
> What could be the issue?
>
> Regards,
> Dhiraj
>


Re: Reg: [VOTE] KIP 157 - Add consumer config options to streams reset tool

2017-05-16 Thread Ewen Cheslack-Postava
+1 (binding)

I mentioned this in the PR that triggered this:

> KIP is accurate, though this is one of those things that we should
probably get a KIP for a standard set of config options across all tools so
additions like this can just fall under the umbrella of that KIP...

I think it would be great if someone wrote up a small KIP providing some
standardized settings that we could get future additions automatically
umbrella'd under, e.g. no need to do a KIP if just adding a consumer.config
or consumer-property config conforming to existing expectations for other
tools. We could also standardize on a few other settings names that are
inconsistent across different tools and set out a clear path forward for
future tools.

I think I still have at least one open PR from when I first started on the
project where I was trying to clean up some command line stuff to be more
consistent. This has been an issue for many years now...

-Ewen



On Tue, May 16, 2017 at 1:12 AM, Eno Thereska 
wrote:

> +1 thanks.
>
> Eno
> > On 16 May 2017, at 04:20, BigData dev  wrote:
> >
> > Hi All,
> > Given the simple and non-controversial nature of the KIP, I would like to
> > start the voting process for KIP-157: Add consumer config options to
> > streams reset tool
> >
> > *https://cwiki.apache.org/confluence/display/KAFKA/KIP+
> 157+-+Add+consumer+config+options+to+streams+reset+tool
> >  157+-+Add+consumer+config+options+to+streams+reset+tool>*
> >
> >
> > The vote will run for a minimum of 72 hours.
> >
> > Thanks,
> >
> > Bharat
>
>


Re: Kafka Connect CPU spikes bring down Kafka Connect workers

2017-05-16 Thread Ewen Cheslack-Postava
On Mon, May 15, 2017 at 2:06 PM, Phillip Mann  wrote:

> Currently, Kafka Connect experiences a spike in CPU usage which causes
> Kafka Connect to crash.


What kind of crash? Can you provide an error or stacktrace?


>   There is really no useful information from the logs to help me
> understand what is causing this to happen.  Is this a known issue?  If it
> matters, my configuration settings are as follows:
>
> "format.class":"com.trulia.footprint.eventexportkafkaconnect.SequenceFileFormat",
> (our custom Sequence File format output)
> "connector.class":"io.confluent.connect.s3.S3SinkConnector",
> "tasks.max":"10",
> "topics":"truliaTopic",
> "flush.size":"15",
> "s3.part.size":"5242880",
> "s3.bucket.name":"truliaBucket",
> "storage.class":"io.confluent.connect.s3.storage.S3Storage",
> "partitioner.class":"com.trulia.footprint.eventexportkafkaconnect.FootprintEventPartitioner",
> (our custom partitioner)
> "schema.generator.class": 
> "io.confluent.connect.storage.hive.schema.DefaultSchemaGenerator",
> "schema.compatibility": "NONE",
> "topics.dir":"truliaOutput"
>

There are multiple custom plugins here -- the format.class and
partitioner.class implementations are not built-in. What are those
implementations, and could they affect CPU usage? I'd normally expect them
to have more of an effect on memory than CPU, but it's worth ruling out. Is
there any processing, compression, etc. happening in them?

-Ewen


Re: offset commitment from another client

2017-04-18 Thread Ewen Cheslack-Postava
Consumers are responsible for committing offsets, not brokers. See
http://kafka.apache.org/documentation.html#design_consumerposition for more
of an explanation of how this is tracked. The brokers help coordinate
this/store the offsets, but it is the consumers that decide when to commit
offsets (indicating they have processed the data).

-Ewen
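The pattern implied here — commit only after the external job finishes — is typically built from two pieces: disabling auto-commit in the consumer, and having the consuming application commit once processing completes. A sketch of the consumer-side settings (standard consumer config names; the surrounding orchestration with Mesos/Docker is up to the application):

```properties
# Consumer settings (illustrative): the consumer, not the broker, commits.
enable.auto.commit=false
# After the Docker/Mesos job reports success, the consuming application
# calls KafkaConsumer#commitSync for the processed offsets.
```

Note this means the process that fetched the message is the one that should commit; the container can report completion back to it rather than committing from inside the container.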

On Mon, Mar 27, 2017 at 2:56 AM, Vova Shelgunov  wrote:

> Hi,
>
> I have an application which consumes messages from Kafka, then it creates a
> Docker container via Mesos which processes incoming message (image), but I
> need to commit an offset only once message is processed inside a Docker
> container. So basically I need to commit offset from another broker (that
> is running in a container).
>
> Will it work?
>
> Thanks
>


Re: even if i pass key no change in partition

2017-04-18 Thread Ewen Cheslack-Postava
Do you have more than 1 partition? You may have an auto-created topic with
only 1 partition, in which case the partition of messages will *always* be
the same, regardless of key.

-Ewen
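Ewen's point can be sketched concretely: for a keyed record the producer picks partition = positiveHash(keyBytes) mod numPartitions, so with a single-partition topic every key maps to partition 0. A minimal illustration — the real DefaultPartitioner uses murmur2 on the serialized key bytes; the polynomial hash below is only a stand-in for the modulo step:

```java
import java.nio.charset.StandardCharsets;

public class PartitionSketch {
    // Stand-in for keyed partitioning: partition = positiveHash(keyBytes)
    // % numPartitions. (The real DefaultPartitioner uses murmur2; a simple
    // polynomial hash is used here purely to illustrate the modulo step.)
    static int partitionFor(String key, int numPartitions) {
        byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
        int hash = 0;
        for (byte b : bytes) {
            hash = 31 * hash + b;
        }
        // Mask the sign bit so the modulo result is non-negative.
        return (hash & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // With an auto-created single-partition topic, every key maps to
        // partition 0, no matter what the key is.
        for (String key : new String[] {"172", "hello world - 86", "other"}) {
            if (partitionFor(key, 1) != 0) {
                throw new AssertionError("single partition must map to 0");
            }
        }
        // With more partitions, different keys can land on different ones.
        System.out.println("172 -> partition " + partitionFor("172", 6));
    }
}
```

So the fix is simply to create the topic with more than one partition; no producer-side change is needed.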

On Fri, Mar 24, 2017 at 5:52 AM, Laxmi Narayan  wrote:

> Hi,
>
> I am passing a key in the producer, but the partition never changes.
>
>
> I can see the key value in the producer response, but the partition stays
> the same.
>
> *This is how my props looks:*
>
> props.put("bootstrap.servers",  "localhost:9092");
> props.put("group.id",   "test");
> props.put("enable.auto.commit", "true");
> props.put("auto.commit.interval.ms","1000");
> props.put("session.timeout.ms", "3");
> props.put("linger.ms",  "1");
> props.put("key.deserializer",
> "org.apache.kafka.common.serialization.StringDeserializer");
> props.put("value.deserializer",
> "org.apache.kafka.common.serialization.StringDeserializer");
> props.put("key.serializer",
> "org.apache.kafka.common.serialization.StringSerializer");
> props.put("value.serializer",
> "org.apache.kafka.common.serialization.StringSerializer");
> props.put("partitioner.class",
> "org.apache.kafka.clients.producer.internals.DefaultPartitioner");
>
>
>
> *This is my output : *
> ConsumerRecord(topic = test, partition = 0, offset = 449, CreateTime =
> 1490359767112, checksum = 1987036294, serialized key size = 3, serialized
> value size = 16, key = 172, value = hello world - 86)
>
>
>
>
>
> *Regards,*
> *Laxmi Narayan Patel*
> *MCA NIT Durgapur (2011-2014)*
> *Mob:-9741292048,8345847473*
>


Re: offset.storage.filename configuration in kafka-connect-hdfs

2017-04-09 Thread Ewen Cheslack-Postava
The offset file is marked as required so we can fail early & fast, but if
you only run sink connectors then offsets will be stored in Kafka's normal
offsets topic and the file will never need to be created. (And the HDFS
connector is even more unusual in that it doesn't even rely on Kafka
offsets because it stores offset information in HDFS to provide exactly
once delivery guarantees.)

-Ewen

On Tue, Mar 7, 2017 at 11:15 PM, FEI Aggie 
wrote:

> Hi,
> I'm running kafka-connect-hdfs 3.1.1.
> offset.storage.file.filename is a required configuration for a standalone
> worker. But when I set this parameter in the worker configuration file for
> kafka-connect-hdfs in standalone mode:
> offset.storage.file.filename=/mnt/data/connect.offsets
>
> it never works: the configured file is not generated.
> Has anyone seen this issue?
>
> Thanks!
> Aggie
>


Re: Kafka Connect behaves weird in case of zombie Kafka brokers. Also, zombie brokers?

2017-04-09 Thread Ewen Cheslack-Postava
Is that node the only bootstrap broker provided? If the Connect worker was
pinned to *only* that broker, it wouldn't have any chance of recovering
correct cluster information from the healthy brokers.

It sounds like there was a separate problem as well (the broker should have
figured out it was in a bad state wrt ZK), but normally we would expect
Connect to detect the issue, mark the coordinator dead (as it did) and then
refresh metadata to figure out which broker it should be talking to now.
There are some known edge cases around how initial cluster discovery works
which *might* be able to get you stuck in this situation.

-Ewen
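One mitigation consistent with the above: give the worker several bootstrap brokers so that initial cluster discovery does not hinge on a single (possibly zombie) node. An illustrative worker config line, with placeholder hostnames:

```properties
# Connect worker config (illustrative hostnames): multiple bootstrap
# brokers let initial discovery survive one bad node.
bootstrap.servers=broker1:9092,broker2:9092,broker3:9092
```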

On Tue, Mar 21, 2017 at 10:43 PM, Anish Mashankar 
wrote:

> Hello everyone,
> We are running a 5 broker Kafka v0.10.0.0 cluster on AWS. Also, the connect
> api is in v0.10.0.0.
> We observed that the distributed Kafka connector went into an infinite
> loop of the log message
>
> (Re-)joining group connect-connect-elasticsearch-indexer.
>
> After a little more digging, we found another infinite loop of a pair of
> log messages:
>
> *Discovered coordinator 1.2.3.4:9092 (id:  rack: null) for group x*
>
> *Marking the coordinator 1.2.3.4:9092  (id: 
> rack:
> null) dead for group x*
>
> Restarting Kafka connect did not help.
>
> Looking at zookeeper, we realized that broker 1.2.3.4 had left the
> zookeeper cluster. It had happened due to a timeout when interacting with
> zookeeper. The broker was also the controller. Failover of controller
> happened, and the applications were fine, but few days later, we started
> facing the above mentioned issue. To add to the surprise, the kafka daemon
> was still running on the host but was not trying to contact zookeeper in
> any time. Hence, zombie broker.
>
> Also, a connect cluster spread across multiple hosts was not working,
> however a single connector worked.
>
> After replacing the EC2 instance for the broker 1.2.3.4, kafka connect
> cluster started working fine. (same broker ID)
>
> I am not sure if this is a bug. Kafka Connect shouldn't keep trying the
> same broker if it cannot establish a connection. We use Consul for service
> discovery. Since the broker process was running and port 9092 was active,
> Consul reported it as healthy and redirected DNS queries to that broker
> when the connector tried to connect. We have disabled DNS caching in the
> Java options, so Kafka Connect should have tried a different host on each
> query, but instead it only ever tried broker 1.2.3.4.
>
> Does kafka connect internally cache DNS? Is there a debugging step I am
> missing here?
> --
> Anish Samir Mashankar
>


Re: kafka connector for mongodb as a source

2017-04-09 Thread Ewen Cheslack-Postava
There is some log noise in there from Reflections, but it does look like
your connector & task are being created:

[2017-03-27 18:33:00,057] INFO Instantiated task mongodb-0 with version
0.10.0.1 of type org.apache.kafka.connect.mongodb.MongodbSourceTask
(org.apache.kafka.connect.runtime.Worker:264)

And I see the producer configs for the source task's underlying producer
being logged. Then we see the following, suggesting some sort of connection
is being made successfully:

[2017-03-27 18:33:00,397] INFO Source task WorkerSourceTask{id=mongodb-0}
finished initialization and start
(org.apache.kafka.connect.runtime.WorkerSourceTask:138)
[2017-03-27 18:33:00,442] INFO No server chosen by
ReadPreferenceServerSelector{readPreference=primary} from cluster
description ClusterDescription{type=UNKNOWN, connectionMode=SINGLE,
all=[ServerDescription{address=localhost:27017, type=UNKNOWN,
state=CONNECTING}]}. Waiting for 3 ms before timing out
(org.mongodb.driver.cluster:71)
[2017-03-27 18:33:00,455] INFO Opened connection
[connectionId{localValue:1, serverValue:4}] to localhost:27017
(org.mongodb.driver.connection:71)
[2017-03-27 18:33:00,457] INFO Monitor thread successfully connected to
server with description ServerDescription{address=localhost:27017,
type=STANDALONE, state=CONNECTED, ok=true,
version=ServerVersion{versionList=[3, 2, 12]}, minWireVersion=0,
maxWireVersion=4, maxDocumentSize=16777216, roundTripTimeNanos=536169}
(org.mongodb.driver.cluster:71)
[2017-03-27 18:33:00,491] INFO Opened connection
[connectionId{localValue:2, serverValue:5}] to localhost:27017
(org.mongodb.driver.connection:71)

But then the logs stop. The framework should just be calling poll() on your
source task. Perhaps you could add some logging to your code to give some
hint as to where it is getting stuck? You could also try increasing the log
level for the framework to DEBUG or even TRACE.

-Ewen
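To raise the framework's log level as suggested, the worker's log4j config can be adjusted. A sketch — the file path varies by installation, and the logger names below are the standard Connect runtime packages:

```properties
# connect-log4j.properties (illustrative)
log4j.rootLogger=INFO, stdout
log4j.logger.org.apache.kafka.connect=DEBUG
# For per-poll detail on the source task:
log4j.logger.org.apache.kafka.connect.runtime.WorkerSourceTask=TRACE
```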

On Mon, Mar 27, 2017 at 6:22 AM, VIVEK KUMAR MISHRA 13BIT0066 <
vivekkumar.mishra2...@vit.ac.in> wrote:

> Hi All,
>
> I am creating a Kafka connector for MongoDB as a source. My connector is
> starting and connecting to Kafka, but it is not committing any offsets.
>
> This is output after starting connector.
>
> [root@localhost kafka_2.11-0.10.1.1]# bin/connect-standalone.sh
> config/connect-standalone.properties config/mongodb.properties
> [2017-03-27 18:32:58,019] INFO StandaloneConfig values:
> rest.advertised.host.name = null
> task.shutdown.graceful.timeout.ms = 5000
> rest.host.name = null
> rest.advertised.port = null
> bootstrap.servers = [localhost:9092]
> offset.flush.timeout.ms = 5000
> offset.flush.interval.ms = 1
> rest.port = 8083
> internal.key.converter = class
> org.apache.kafka.connect.json.JsonConverter
> access.control.allow.methods =
> access.control.allow.origin =
> offset.storage.file.filename = /tmp/connect.offsets
> internal.value.converter = class
> org.apache.kafka.connect.json.JsonConverter
> value.converter = class org.apache.kafka.connect.json.
> JsonConverter
> key.converter = class org.apache.kafka.connect.json.JsonConverter
>  (org.apache.kafka.connect.runtime.standalone.StandaloneConfig:178)
> [2017-03-27 18:32:58,162] INFO Logging initialized @609ms
> (org.eclipse.jetty.util.log:186)
> [2017-03-27 18:32:58,392] INFO Kafka Connect starting
> (org.apache.kafka.connect.runtime.Connect:52)
> [2017-03-27 18:32:58,392] INFO Herder starting
> (org.apache.kafka.connect.runtime.standalone.StandaloneHerder:70)
> [2017-03-27 18:32:58,393] INFO Worker starting
> (org.apache.kafka.connect.runtime.Worker:113)
> [2017-03-27 18:32:58,393] INFO Starting FileOffsetBackingStore with file
> /tmp/connect.offsets
> (org.apache.kafka.connect.storage.FileOffsetBackingStore:60)
> [2017-03-27 18:32:58,398] INFO Worker started
> (org.apache.kafka.connect.runtime.Worker:118)
> [2017-03-27 18:32:58,398] INFO Herder started
> (org.apache.kafka.connect.runtime.standalone.StandaloneHerder:72)
> [2017-03-27 18:32:58,398] INFO Starting REST server
> (org.apache.kafka.connect.runtime.rest.RestServer:98)
> [2017-03-27 18:32:58,493] INFO jetty-9.2.15.v20160210
> (org.eclipse.jetty.server.Server:327)
> [2017-03-27 18:32:59,621] INFO HV01: Hibernate Validator 5.1.2.Final
> (org.hibernate.validator.internal.util.Version:27)
> Mar 27, 2017 6:32:59 PM org.glassfish.jersey.internal.Errors logErrors
> WARNING: The following warnings have been detected: WARNING: The
> (sub)resource method listConnectors in
> org.apache.kafka.connect.runtime.rest.resources.ConnectorsResource
> contains
> empty path annotation.
> WARNING: The (sub)resource method createConnector in
> org.apache.kafka.connect.runtime.rest.resources.ConnectorsResource
> contains
> empty path annotation.
> WARNING: The (sub)resource method listConnectorPlugins in
> org.apache.kafka.connect.runtime.rest.resources.ConnectorPluginsResource
> contains empty 

Re: Are there Connector artifacts in Confluent or any other Maven repository?

2017-03-21 Thread Ewen Cheslack-Postava
Yes, these get published to Confluent's maven repository. Follow the
instructions here
http://docs.confluent.io/current/installation.html#installation-maven for
adding the Confluent maven repository to your project and then add a
dependency for the connector to your project (e.g. for that partitioner you
need io.confluent:kafka-connect-hdfs). Be sure to add it as a provided
dependency so you don't actually get an extra copy of the connector and its
dependencies.

-Ewen
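Sketched as pom.xml fragments — the repository URL is Confluent's public Maven repository, while the version shown is illustrative and should match your Confluent platform release:

```xml
<!-- Confluent's Maven repository -->
<repositories>
  <repository>
    <id>confluent</id>
    <url>http://packages.confluent.io/maven/</url>
  </repository>
</repositories>

<!-- The connector, scoped 'provided' so it is not bundled a second time -->
<dependencies>
  <dependency>
    <groupId>io.confluent</groupId>
    <artifactId>kafka-connect-hdfs</artifactId>
    <version>3.2.0</version>
    <scope>provided</scope>
  </dependency>
</dependencies>
```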

On Tue, Mar 21, 2017 at 1:57 PM, Phillip Mann  wrote:

> I am trying to migrate from StreamX (https://github.com/qubole/streamx)
> to use the official Confluent S3 connector (https://github.com/
> confluentinc/kafka-connect-storage-cloud).  Part of my implementation of
> Kafka Connect requires a custom partitioner.  This partitioner originally
> extended the Partitioner defined here (https://github.com/
> confluentinc/kafka-connect-hdfs/blob/master/src/main/
> java/io/confluent/connect/hdfs/partitioner/Partitioner.java).  This was
> possible because I would build StreamX and add it to my company's artifact
> repository.  However, before I fork a bunch of different Confluent projects
> and then add them to my company's repository, I would like to know if it
> would be possible to import different Confluent projects such as the HDFS
> connector and S3 connector through Maven so that I can use code from these
> projects.  If this doesn't exist, why doesn't Confluent add these artifacts
> to the Confluent repository?  Thanks for your help!
>
> Phillip
>


[ANNOUNCE] Apache Kafka 0.10.2.0 Released

2017-02-22 Thread Ewen Cheslack-Postava
The Apache Kafka community is pleased to announce the release for Apache
Kafka 0.10.2.0. This is a feature release which includes the completion
of 15 KIPs, over 200 bug fixes and improvements, and more than 500 pull
requests merged.

All of the changes in this release can be found in the release notes:
https://archive.apache.org/dist/kafka/0.10.2.0/RELEASE_NOTES.html

Apache Kafka is a distributed streaming platform with four core
APIs:

** The Producer API allows an application to publish a stream of records to
one or more Kafka topics.

** The Consumer API allows an application to subscribe to one or more
topics and process the stream of records produced to them.

** The Streams API allows an application to act as a stream processor,
consuming an input stream from one or more topics and producing an output
stream to one or more output topics, effectively transforming the input
streams to output streams.

** The Connector API allows building and running reusable producers or
consumers that connect Kafka topics to existing applications or data
systems. For example, a connector to a relational database might capture
every change to a table.


With these APIs, Kafka can be used for two broad classes of applications:

** Building real-time streaming data pipelines that reliably get data
between systems or applications.

** Building real-time streaming applications that transform or react to the
streams of data.


You can download the source release from
https://www.apache.org/dyn/closer.cgi?path=/kafka/0.10.2.0/kafka-0.10.2.0-src.tgz

and binary releases from
https://www.apache.org/dyn/closer.cgi?path=/kafka/0.10.2.0/kafka_2.11-0.10.2.0.tgz
https://www.apache.org/dyn/closer.cgi?path=/kafka/0.10.2.0/kafka_2.10-0.10.2.0.tgz
https://www.apache.org/dyn/closer.cgi?path=/kafka/0.10.2.0/kafka_2.12-0.10.2.0.tgz
(experimental 2.12 artifact)

Thanks to the 101 contributors on this release!

Akash Sethi, Alex Loddengaard, Alexey Ozeritsky, amethystic, Andrea
Cosentino, Andrew Olson, Andrew Stevenson, Anton Karamanov, Antony
Stubbs, Apurva Mehta, Arun Mahadevan, Ashish Singh, Balint Molnar, Ben
Stopford, Bernard Leach, Bill Bejeck, Colin P. Mccabe, Damian Guy, Dan
Norwood, Dana Powers, dasl, Derrick Or, Dong Lin, Dustin Cote, Edoardo
Comar, Edward Ribeiro, Elias Levy, Emanuele Cesena, Eno Thereska, Ewen
Cheslack-Postava, Flavio Junqueira, fpj, Geoff Anderson, Guozhang Wang,
Gwen Shapira, Hikiko Murakami, Himani Arora, himani1, Hojjat Jafarpour,
huxi, Ishita Mandhan, Ismael Juma, Jakub Dziworski, Jan Lukavsky, Jason
Gustafson, Jay Kreps, Jeff Widman, Jeyhun Karimov, Jiangjie Qin, Joel
Koshy, Jon Freedman, Joshi, Jozef Koval, Json Tu, Jun He, Jun Rao,
Kamal, Kamal C, Kamil Szymanski, Kim Christensen, Kiran Pillarisetty,
Konstantine Karantasis, Lihua Xin, LoneRifle, Magnus Edenhill, Magnus
Reftel, Manikumar Reddy O, Mark Rose, Mathieu Fenniak, Matthias J. Sax,
Mayuresh Gharat, MayureshGharat, Michael Schiff, Mickael Maison,
MURAKAMI Masahiko, Nikki Thean, Olivier Girardot, pengwei-li, pilo,
Prabhat Kashyap, Qian Zheng, Radai Rosenblatt, radai-rosenblatt, Raghav
Kumar Gautam, Rajini Sivaram, Rekha Joshi, rnpridgeon, Ryan Pridgeon,
Sandesh K, Scott Ferguson, Shikhar Bhushan, steve, Stig Rohde Døssing,
Sumant Tambe, Sumit Arrawatia, Theo, Tim Carey-Smith, Tu Yang, Vahid
Hashemian, wangzzu, Will Marshall, Xavier Léauté, Xavier Léauté, Xi Hu,
Yang Wei, yaojuncn, Yuto Kawamura

We welcome your help and feedback. For more information on how to
report problems, and to get involved, visit the project website at
http://kafka.apache.org/

Thanks,
Ewen


Re: [VOTE] 0.10.2.0 RC2

2017-02-18 Thread Ewen Cheslack-Postava
This vote passes with 12 +1 votes (4 bindings) and no 0 or -1 votes.

+1 votes
PMC Members:
* Guozhang Wang
* Jun Rao
* Gwen Shapira
* Neha Narkhede

Committers:
* Ismael Juma
* Sriram Subramanian

Community:
* Tom Crayford
* Magnus Edenhill
* Mathieu Fenniak
* Rajini Sivaram
* Manikumar Reddy
* Vahid Hashemian

0 votes
* No votes

-1 votes
* No votes

I'll continue with the release process and the release announcement will
follow in the next few days.

-Ewen

On Thu, Feb 16, 2017 at 10:26 AM, Vahid S Hashemian <
vahidhashem...@us.ibm.com> wrote:

> +1 (non-binding)
>
> Built from the source and ran the quickstart successfully on Ubuntu, Mac,
> Windows (64 bit).
>
> Thank you Ewen for running the release.
>
> --Vahid
>
>
>
> From:   Ewen Cheslack-Postava <e...@confluent.io>
> To: d...@kafka.apache.org, "users@kafka.apache.org"
> <users@kafka.apache.org>, "kafka-clie...@googlegroups.com"
> <kafka-clie...@googlegroups.com>
> Date:   02/14/2017 10:40 AM
> Subject:[VOTE] 0.10.2.0 RC2
>
>
>
> Hello Kafka users, developers and client-developers,
>
> This is the third candidate for release of Apache Kafka 0.10.2.0.
>
> This is a minor version release of Apache Kafka. It includes 19 new KIPs.
> See the release notes and release plan (https://cwiki.apache.org/conf
> luence/display/KAFKA/Release+Plan+0.10.2.0) for more details. A few
> feature
> highlights: SASL-SCRAM support, improved client compatibility to allow use
> of clients newer than the broker, session windows and global tables in the
> Kafka Streams API, single message transforms in the Kafka Connect
> framework.
>
> Important note: in addition to the artifacts generated using JDK7 for
> Scala
> 2.10 and 2.11, this release also includes experimental artifacts built
> using JDK8 for Scala 2.12.
>
> Important code changes since RC1 (non-docs, non system tests):
>
> KAFKA-4756; The auto-generated broker id should be passed to
> MetricReporter.configure
> KAFKA-4761; Fix producer regression handling small or zero batch size
>
> Release notes for the 0.10.2.0 release:
> http://home.apache.org/~ewencp/kafka-0.10.2.0-rc2/RELEASE_NOTES.html
>
> *** Please download, test and vote by February 17th 5pm ***
>
> Kafka's KEYS file containing PGP keys we use to sign the release:
> http://kafka.apache.org/KEYS
>
> * Release artifacts to be voted upon (source and binary):
> http://home.apache.org/~ewencp/kafka-0.10.2.0-rc2/
>
> * Maven artifacts to be voted upon:
> https://repository.apache.org/content/groups/staging/
>
> * Javadoc:
> http://home.apache.org/~ewencp/kafka-0.10.2.0-rc2/javadoc/
>
> * Tag to be voted upon (off 0.10.2 branch) is the 0.10.2.0 tag:
> https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=
> 5712b489038b71ed8d5a679856d1dfaa925eadc1
>
>
>
> * Documentation:
> http://kafka.apache.org/0102/documentation.html
>
> * Protocol:
> http://kafka.apache.org/0102/protocol.html
>
> * Successful Jenkins builds for the 0.10.2 branch:
> Unit/integration tests:
> https://builds.apache.org/job/kafka-0.10.2-jdk7/77/
> System tests:
> https://jenkins.confluent.io/job/system-test-kafka-0.10.2/29/
>
>
> Thanks,
> Ewen
>
>
>
>
>


[VOTE] 0.10.2.0 RC2

2017-02-14 Thread Ewen Cheslack-Postava
Hello Kafka users, developers and client-developers,

This is the third candidate for release of Apache Kafka 0.10.2.0.

This is a minor version release of Apache Kafka. It includes 19 new KIPs.
See the release notes and release plan (https://cwiki.apache.org/conf
luence/display/KAFKA/Release+Plan+0.10.2.0) for more details. A few feature
highlights: SASL-SCRAM support, improved client compatibility to allow use
of clients newer than the broker, session windows and global tables in the
Kafka Streams API, single message transforms in the Kafka Connect framework.

Important note: in addition to the artifacts generated using JDK7 for Scala
2.10 and 2.11, this release also includes experimental artifacts built
using JDK8 for Scala 2.12.

Important code changes since RC1 (non-docs, non system tests):

KAFKA-4756; The auto-generated broker id should be passed to
MetricReporter.configure
KAFKA-4761; Fix producer regression handling small or zero batch size

Release notes for the 0.10.2.0 release:
http://home.apache.org/~ewencp/kafka-0.10.2.0-rc2/RELEASE_NOTES.html

*** Please download, test and vote by February 17th 5pm ***

Kafka's KEYS file containing PGP keys we use to sign the release:
http://kafka.apache.org/KEYS

* Release artifacts to be voted upon (source and binary):
http://home.apache.org/~ewencp/kafka-0.10.2.0-rc2/

* Maven artifacts to be voted upon:
https://repository.apache.org/content/groups/staging/

* Javadoc:
http://home.apache.org/~ewencp/kafka-0.10.2.0-rc2/javadoc/

* Tag to be voted upon (off 0.10.2 branch) is the 0.10.2.0 tag:
https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=5712b489038b71ed8d5a679856d1dfaa925eadc1


* Documentation:
http://kafka.apache.org/0102/documentation.html

* Protocol:
http://kafka.apache.org/0102/protocol.html

* Successful Jenkins builds for the 0.10.2 branch:
Unit/integration tests: https://builds.apache.org/job/kafka-0.10.2-jdk7/77/
System tests: https://jenkins.confluent.io/job/system-test-kafka-0.10.2/29/


Thanks,
Ewen


[VOTE] 0.10.2.0 RC1

2017-02-10 Thread Ewen Cheslack-Postava
Hello Kafka users, developers and client-developers,

This is RC1 for release of Apache Kafka 0.10.2.0.

This is a minor version release of Apache Kafka. It includes 19 new KIPs.
See the release notes and release plan (https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+0.10.2.0) for more details. A few
feature highlights: SASL-SCRAM support, improved client compatibility to
allow use of clients newer than the broker, session windows and global
tables in the Kafka Streams API, single message transforms in the Kafka
Connect framework.

Important note: in addition to the artifacts generated using JDK7 for Scala
2.10 and 2.11, this release also includes experimental artifacts built
using JDK8 for Scala 2.12.

Important code changes since RC0 (non-docs, non system tests):

* KAFKA-4728; KafkaConsumer#commitSync should copy its input
* KAFKA-4441; Monitoring incorrect during topic creation and deletion
* KAFKA-4734; Trim the time index on old segments
* KAFKA-4725; Stop leaking messages in produce request body when requests
are delayed
* KAFKA-4716: Fix case when controller cannot be reached

Release notes for the 0.10.2.0 release:
http://home.apache.org/~ewencp/kafka-0.10.2.0-rc1/RELEASE_NOTES.html

*** Please download, test and vote by Monday, Feb 13, 5pm PT ***

Kafka's KEYS file containing PGP keys we use to sign the release:
http://kafka.apache.org/KEYS

* Release artifacts to be voted upon (source and binary):
http://home.apache.org/~ewencp/kafka-0.10.2.0-rc1/

* Maven artifacts to be voted upon:
https://repository.apache.org/content/groups/staging/

* Javadoc:
http://home.apache.org/~ewencp/kafka-0.10.2.0-rc1/javadoc/

* Tag to be voted upon (off 0.10.2 branch) is the 0.10.2.0 tag:
https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=e825b7994bf8c8c4871d1e0973e287e6d31c7ae4


* Documentation:
http://kafka.apache.org/0102/documentation.html

* Protocol:
http://kafka.apache.org/0102/protocol.html

* Successful Jenkins builds for the 0.10.2 branch:
Unit/integration tests: https://builds.apache.org/job/kafka-0.10.2-jdk7/74/
System tests: https://jenkins.confluent.io/job/system-test-kafka-0.10.2/25/


Thanks,
Ewen


Re: When publishing to non existing topic, TimeoutException is thrown instead of UnknownTopicOrPartitionException

2017-01-30 Thread Ewen Cheslack-Postava
Stevo,

Agreed that this seems broken if we're just timing out trying to fetch
metadata if we should be able to tell that the topic will never be created.

Clients can't explicitly tell whether auto topic creation is on. Implicit
indication via the error code seems like a good idea. My only concern is
what happens in clusters where that setting is mixed (whether because it is
being turned on (I hope not) or turned off (I hope so, but I want things to
behave nicely)).

-Ewen

On Fri, Jan 27, 2017 at 5:43 AM, Stevo Slavić  wrote:

> Found somewhat related ticket https://issues.apache.org/jira/browse/KAFKA-4385
>
> On Fri, Jan 27, 2017 at 1:09 PM, Stevo Slavić  wrote:
>
> > Hello Apache Kafka community,
> >
> > When using new async KafkaProducer, one can register callback with send
> > calls.
> >
> > With auto.create.topics.enable set to false, when I try to publish to non
> > existing topic, I expect callback to complete with
> > UnknownTopicOrPartitionException. Instead, I get back
> > "org.apache.kafka.common.errors.TimeoutException: Failed to update
> > metadata after..."
> >
> > When topic doesn't truly exist (so metadata request reached broker and
> > response included error that topic is unknown) I would like to handle
> that
> > case differently, than when there are e.g. networking problems like when
> > metadata response was not received on time and timeout exception is
> > appropriate.
> >
> > I've reproduced this unwanted behavior with both Kafka 0.9.0.1 and
> 0.10.1.1
> >
> > Is this a feature or a bug? If feature, would it make sense to improve it
> > in this case to throw UnknownTopicOrPartitionException instead?
> >
> > Kind regards,
> > Stevo Slavic.
> >
>


Re: kafka_2.10-0.8.1 simple consumer retrieves junk data in the message

2017-01-30 Thread Ewen Cheslack-Postava
What are the 26 additional bytes? That sounds like a header that a
decoder/deserializer is handling with the high level consumer. What class
are you using to deserialize the messages with the high level consumer?

-Ewen

On Fri, Jan 27, 2017 at 10:19 AM, Anjani Gupta 
wrote:

> I am using kafka_2.10-0.8.1 and trying to fetch messages using Simple
> Consumer API. I notice that byte array for message retrieved has 26 junk
> bytes appended at the beginning  of original message sent by producer. Any
> idea what's going on here? This works fine with High level consumer.
>
> This is how my code looks like -
>
> TopicAndPartition topicAndPartition = new TopicAndPartition(topic,
> partition);
> OffsetFetchResponse offsetFetchResponse = consumer.fetchOffsets(new
> OffsetFetchRequest(GROUP_ID,
> Collections.singletonList(topicAndPartition), (short) 0,
> 0,
> CLIENT_ID));
>
> //Fetch messages from Kafka.
> FetchRequest req = new FetchRequestBuilder()
> .clientId(CLIENT_ID)
> .addFetch(topic, partition, readOffset, 1000)
> .build();
> FetchResponse fetchResponse = consumer.fetch(req);
> for (MessageAndOffset messageAndOffset :
> fetchResponse.messageSet(topicName, partition)) {
> byte[] message = messageAndOffset.message().payload().array();
>
> }
>
> Here message has additional 26 bytes appended to beginning of array.
>
>
> Thanks,
> Anjani
>
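One thing worth checking here (this is an assumption based only on the snippet above, not a confirmed diagnosis): `ByteBuffer.array()` returns the buffer's entire backing array and ignores the buffer's position, so if the payload buffer shares a backing array with the message header, those header bytes come along. Copying via `get()` respects position/limit. A minimal stdlib-only illustration:

```java
import java.nio.ByteBuffer;

public class PayloadCopy {
    // Copies only the bytes between position and limit, unlike array().
    static byte[] copyPayload(ByteBuffer payload) {
        byte[] bytes = new byte[payload.remaining()];
        payload.duplicate().get(bytes);  // duplicate() so the caller's position is untouched
        return bytes;
    }

    public static void main(String[] args) {
        byte[] backing = "26-byte-header|actual-payload".getBytes();
        // Simulate a payload buffer whose backing array starts with header bytes.
        ByteBuffer payload = ByteBuffer.wrap(backing, 15, backing.length - 15);
        System.out.println(new String(payload.array()));      // includes the header bytes
        System.out.println(new String(copyPayload(payload))); // just the payload
    }
}
```

If the 26 extra bytes disappear with this kind of copy, the issue is the array() call rather than anything broker-side.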


Re: Upgrade questions

2017-01-30 Thread Ewen Cheslack-Postava
Note that the documentation that you linked to for upgrades specifically
lists configs that you need to be careful to adjust in your
server.properties.

In fact, the server.properties shipped with Kafka is meant for testing
only. There are some configs in the example server.properties that are not
safe to use in production, for example log.dirs, which I see you've
adjusted.

That said, we are careful about not changing default values in the actual
service without warning. If we change them, it would be during major
version changes and we would still warn you about it in the release notes.
You are making a big jump (0.8.2-beta to 0.10.x) so you'd want to check the
0.8.2, 0.9, and 0.10 release notes. The most obvious thing I'd expect to be
an issue would be the log format changes and inter protocol version
settings. These are settings you need to be careful about setting correctly
during upgrades and are clearly described in the docs you already linked:
http://kafka.apache.org/0100/documentation.html#upgrade
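For reference, the relevant fragment of server.properties during such a rolling upgrade looks roughly like the following. The version values here are illustrative for an 0.8.2-to-0.10.x path; follow the upgrade doc above for the exact values for your versions:

```properties
# Step 1: upgrade the binaries while pinning the old protocol and
# message format versions, one broker at a time.
inter.broker.protocol.version=0.8.2
log.message.format.version=0.8.2
# Step 2: once every broker runs the new code, bump
# inter.broker.protocol.version (and later log.message.format.version)
# and do another rolling restart.
```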

-Ewen

On Thu, Jan 26, 2017 at 3:22 PM, Fernando Vega 
wrote:

> I have a few questions regarding an upgrade that Im attempting to perform.
>
> - Currently we are running version 0.8.2_beta.
> - We are trying to upgrade to 10.1.1
> - Our setup uses the following path
>/server/kafka
>where kafka is a symlink to kafka-{version}
> - I attempt to perform the upgrade as you guys specify in the
> documentation:
> http://kafka.apache.org/0100/documentation.html#upgrade
>
> However when I attempt this my server.properties file that was in
> kafka-0.8.2_beta was gone and a new file was under kafka-0.10.1.1.
>
> I started the server without noticing this at first and it started
> normally.
> When I realize about it I stop it and copy the server.properties file from
> another host, running the old version into this server in the new directory
> and adding the 2 lines mentioned in the instructions above as well as
> changing its broker id of course, when I do this the server does not start.
>
> So my questions are:
>
> Are any of the config files different from one version to another?
> Are the options the same from one version to another?
>
> This is a sample of the file that the old cluster uses and I will like to
> know if this options are still valid for the new version:
>
> ## See http://kafka.apache.org/documentation.html#brokerconfigs for
> default
> values.
>
> # The id of the broker. This must be set to a unique integer for each
> broker.
> broker.id=6
>
> # The port the socket server listens on
> port=9092
>
> # A comma seperated list of directories under which to store log files
> log.dirs=/data/1/kafka/datalog,/data/2/kafka/datalog,/data/3/kafka/datalog,/data/4/kafka/datalog,/data/5/kafka/datalog,/data/6/kafka/datalog,/data/7/kafka/datalog,/data/8/kafka/datalog
>
> # Zookeeper connection string (see zookeeper docs for details).
> # This is a comma separated host:port pairs, each corresponding to a zk
> # server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
> # You can also append an optional chroot string to the urls to specify the
> # root directory for all kafka znodes.
>
> zookeeper.connect=dwh-pipeline001:2181,dwh-pipeline002:2181,dwh-pipeline003:2181,dwh-pipeline004:2181,dwh-pipeline005:2181/kafka/data
>
> # Additional configuration options may follow here
> auto.leader.rebalance.enable=true
> delete.topic.enable=true
> log.retention.size=200
> socket.receive.buffer.bytes=1048576
> socket.send.buffer.bytes=1048576
> default.replication.factor=2
> auto.create.topics.enable=true
> num.partitions=100
> num.network.threads=3
> num.io.threads=40
> log.retention.hours=24
> log.roll.hours=1
> num.replica.fetchers=8
> zookeeper.connection.timeout.ms=3
> zookeeper.session.timeout.ms=3
>
> Also, what are the steps or ways I can check that the upgrade worked fine,
> that I'm running x or y version, and generally that things are working correctly?
>
> Thank you
>
>
> *Fernando Vega*
> Sr. Operations Engineer
> *cell* (415) 810-0242
> 901 Marshall St, Suite 200, Redwood City, CA 94063
>
>
> turn.com    |   @TurnPlatform
> 
>
> This message is Turn Confidential, except for information included that is
> already available to the public. If this message was sent to you
> accidentally, please delete it.
>


Re: Kafka JDBC connector vs Sqoop

2017-01-30 Thread Ewen Cheslack-Postava
For MySQL you would either want to use Debezium's connector (which can
handle bulk dump + incremental CDC, but requires direct access to the
binlog) or the JDBC connector (does an initial bulk dump + incremental
queries, but has limitations compared to a "true" CDC solution).

Sqoop and the JDBC connector will have largely similar limitations since
they do not look at the database's transaction logs directly. They also get
similar benefits as a result: don't need the same access rights, don't need
to be colocated, etc. A direct CDC connector gets the opposite set of
tradeoffs -- needs direct colocation (though it could be on a replica), but
also gets direct access to transactions, doesn't end up with limitations on
transaction duration as described in http://docs.confluent.io/3.1.2/connect/connect-jdbc/docs/source_connector.html#incremental-query-modes,
etc.

-Ewen

On Thu, Jan 26, 2017 at 12:43 PM, Buntu Dev  wrote:

> I'm looking for ways to bulk/incremental import from MySQL database to
> HDFS. Currently I got Sqoop that does the bulk import creating a Hive
> table.
>
> Wanted to know the pros/cons of using JDBC connector instead of Sqoop and
> are there any MySQL config changes expected (like binlog configuration in
> the case of CDC connectors) to import insert/alter/delete statement.
>
>
> Thanks!
>


Re: using kafka log compaction withour key

2017-01-30 Thread Ewen Cheslack-Postava
The log compaction functionality uses the key to determine which records to
deduplicate. You can think of it (very roughly) as deleting entries from a
hash map as the value for each key is overwritten. This functionality
doesn't have much of a point unless you include keys in your records.
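The hash-map analogy can be sketched as a toy model (this illustrates the semantics only; it is not how the broker's log cleaner is implemented):

```java
import java.util.*;

public class CompactionSketch {
    // A record is just (key, value). Compaction keeps only the latest
    // value for each key, like overwriting entries in a map.
    static List<String[]> compact(List<String[]> log) {
        Map<String, String> latest = new LinkedHashMap<>();
        for (String[] record : log) {
            latest.put(record[0], record[1]);
        }
        List<String[]> out = new ArrayList<>();
        for (Map.Entry<String, String> e : latest.entrySet()) {
            out.add(new String[]{e.getKey(), e.getValue()});
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> log = Arrays.asList(
            new String[]{"k1", "v1"},
            new String[]{"k2", "v2"},
            new String[]{"k1", "v3"});  // overwrites k1=v1
        for (String[] r : compact(log)) {
            System.out.println(r[0] + "=" + r[1]);
        }
        // Without keys, every record is distinct and nothing can be deduplicated.
    }
}
```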

-Ewen

On Thu, Jan 26, 2017 at 5:28 AM, Samy CHBINOU 
wrote:

> Hello,
>
> Is it possible to use log compaction without a key? I think in that case the
> buffer will contain only one data value? Is that correct?
>
> thanks
>
>


Re: special characters in kafka log

2017-01-30 Thread Ewen Cheslack-Postava
Not sure what special characters you are referring to, but for data in the
key and value fields in Kafka, it handles arbitrary binary data. "Special
characters" aren't special because Kafka doesn't even inspect the data it
is handling: clients tell it the length of the data and then it copies that
number of bytes.

Serializers for different clients could potentially run into problems, but
all popular clients/serializers in use have been heavily tested and
validated.

-Ewen

On Tue, Jan 24, 2017 at 11:25 PM, Laxmi Narayan NIT DGP <
nit.dgp...@gmail.com> wrote:

> Hi ,
>
> Has anybody tested special characters inside Kafka?
>
> I am little worried about serialization and de-serialization of special
> characters.
>
>
>
> *Regards,*
> *Laxmi Narayan Patel*
> *MCA NIT Durgapur (2011-2014)*
> *Mob:-9741292048,8345847473*
>


Re: kafka connect architecture

2017-01-30 Thread Ewen Cheslack-Postava
On Mon, Jan 30, 2017 at 8:24 AM, Koert Kuipers  wrote:

> i have been playing with kafka connect in standalone and distributed mode.
>
> i like standalone because:
> * i get to configure it using a file. this is easy for automated deployment
> (chef, puppet, etc.). configuration using a rest api i find inconvenient.
>

What exactly is inconvenient? The orchestration tools you mention all have
built-in tooling to make REST requests. In fact, you could pretty easily
take a config file you could use with standalone mode and convert it into
the JSON payload for the REST API and simply make that request. If the
connector already exists with the same config, it shouldn't have any effect
on the cluster -- it's just a noop re-registration.
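The conversion described above is mechanical. A rough sketch, where the connector config and the endpoint URL are made up for illustration:

```java
import java.io.StringReader;
import java.util.Properties;

public class PropsToConnectJson {
    // Turns standalone-style connector properties into the JSON body
    // expected by POST /connectors on a distributed worker.
    // (No escaping of quotes in values -- a sketch, not a JSON library.)
    static String toJson(Properties props) {
        StringBuilder config = new StringBuilder();
        for (String key : props.stringPropertyNames()) {
            if (key.equals("name")) continue;
            if (config.length() > 0) config.append(", ");
            config.append('"').append(key).append("\": \"")
                  .append(props.getProperty(key)).append('"');
        }
        return "{\"name\": \"" + props.getProperty("name")
             + "\", \"config\": {" + config + "}}";
    }

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.load(new StringReader(
            "name=local-file-sink\n" +
            "connector.class=FileStreamSink\n" +
            "tasks.max=1\n"));
        System.out.println(toJson(props));
        // The result can then be submitted with whatever HTTP tooling
        // your deployment automation already has, e.g.:
        // curl -X POST -H "Content-Type: application/json" \
        //      --data @connector.json http://worker:8083/connectors
    }
}
```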


> * errors show up in log files instead of having to retrieve them using a
> rest api. same argument as previous bullet point really. i know how to
> automate log monitoring. rest api isnt great for this.
>

If you run in distributed mode, you probably also want to collect log files
somehow. The errors still show up in log files, they are just spread across
multiple nodes so you may need to collect them to put them in a central
location. (Hint: connect can do this :))


> * isolation of connector classes. every connector has its own jvm. no jar
> dependency hell.
>

Yup, this is definitely a pain point. We're looking into classpath
isolation in a subsequent release (won't be in AK 0.10.2.0/CP 3.2.0, but I
am hoping it will be in AK 0.10.3.0/CP3.3.0).


>
> i like distributed because:
> * well its fault tolerant and can distribute workload
>
> so this makes me wonder... how hard would it be to get the
> "connect-standalone" setup where each connector has its own service(s),
> configuration is done using files, and errors are written to logs, yet at
> the same time i can spin up multiple services for a connector and they form
> a group? and while we are at it also remove the rest api entirely, since i
> dont need it, it poses a security risk, and it makes it hard to spin up
> multiple connectors on same box. with such a setup i could simply deploy as
> many services as i need for a connector, using either chef, or perhaps
> slider on yarn, or whatever framework i need.
>

A distributed mode driven by config files is possible and something that's
been brought up before, although does have some complicating factors. Doing
a rolling bounce of such a service gets tricky in the face of failures as
you might have old & new versions of the app starting simultaneously (i.e.
it becomes difficult to figure out which config to trust).

As to removing the REST API in some cases, I guess I could imagine doing
it, but in practice you should probably just lock down access by never
allowing access to that port. If you're worried about security, you should
have all ports disabled by default; if you don't want to provide access to
the REST API, simply don't enable access to it.

-Ewen


>
> this is related to KAFKA-3815
>  which makes similar
> arguments for container deployments
>


Re: Ideal value for Kafka Connect Distributed tasks.max configuration setting?

2017-01-30 Thread Ewen Cheslack-Postava
On Fri, Jan 27, 2017 at 10:49 AM, Phillip Mann  wrote:

> I am looking to productionize and deploy my Kafka Connect application.
> However, I have two questions about the tasks.max setting, which is required
> and of high importance, but for which the details of what to actually set
> this value to are vague.
>
> My simplest question then is as follows: If I have a topic with n
> partitions that I wish to consume data from and write to some sink (in my
> case, I am writing to S3), what should I set tasks.max to? Should I set it
> to n? Should I set it to 2n? Intuitively it seems that I'd want to set the
> value to n and that's what I've been doing.
>

For sink connectors, you cannot get any more parallelism than the # of
topic partitions. While you can set the tasks.max larger than that, it will
not help.

However, you don't need to set tasks.max to the number of topic partitions.
tasks.max maps directly to the # of threads you have processing data. If
the throughput for a topic is low, you might want to set this to a small
number, possibly even 1. If the throughput for each topic partition is
high, you might need n tasks just to keep up.

It's hard to give a definitive answer here because the answer is really
just that "it depends". You can do some amount of capacity planning ahead
of time, but the best approach is to monitor how far you are lagging behind
the input data. If there is lag, increase the # of tasks. If you're not
even utilizing the resources you provided, perhaps you can scale back.
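In other words, for a sink connector the effective parallelism is just min(tasks.max, number of topic partitions); a trivial illustration:

```java
public class SinkParallelism {
    // Effective number of useful sink tasks: tasks beyond the partition
    // count would simply sit idle with no partitions assigned.
    static int effectiveTasks(int tasksMax, int topicPartitions) {
        return Math.min(tasksMax, topicPartitions);
    }

    public static void main(String[] args) {
        int n = 12;                                    // partitions
        System.out.println(effectiveTasks(n, n));      // 12: one task per partition
        System.out.println(effectiveTasks(2 * n, n));  // still 12: 2n buys nothing
        System.out.println(effectiveTasks(1, n));      // 1: fine for low throughput
    }
}
```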


>
> What if I change my Kafka topic and increase partitions on the topic? I
> will have to pause my Kafka Connector and increase the tasks.max if I set
> it to n? If I have set a value of 2n, then my connector should
> automatically increase the parallelism it operates?
>

You don't need to explicitly pause the connector. You can reset a
configuration dynamically and the cluster will sort out the pause/restart
process automatically.

-Ewen


>
> Thanks for your help!
>


Re: Automatic Offset Committing

2017-01-23 Thread Ewen Cheslack-Postava
We plan to, but there is quite a bit of functionality that needs to be
abstracted by requests to the broker. Most of the functionality in the
topics command that interacts directly with ZK will be replaced by KIP-4
protocols (
https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations)
and gradually transitioned over. Just as with the new clients, this
implementation and transition will take some time.

-Ewen

On Mon, Jan 23, 2017 at 10:37 PM, Stephane Maarek <
steph...@simplemachines.com.au> wrote:

> Hi Ewen
>
> If the trend is to hide zookeeper entirely (and most likely restricting its
> network connection to Kafka only ) would it make sense to update the Kafka
> topics tool ? Currently it is
>
> > *bin/kafka-topics.sh --create --zookeeper localhost:2181
> --replication-factor 1 --partitions 1 --topic test*
>
>
> On 24 Jan. 2017 3:25 pm, "Ewen Cheslack-Postava" <e...@confluent.io>
> wrote:
>
> > The new consumer only supports committing offsets to Kafka. (It doesn't
> > even have connection info to ZooKeeper, which is a general trend in Kafka
> > clients -- all details of ZooKeeper are being hidden away from clients,
> > even administrative functions like creating topics.)
> >
> > -Ewen
> >
> > On Thu, Jan 19, 2017 at 7:00 PM, vinay ng <vinay.ng@gmail.com>
> wrote:
> >
> > > Hi,
> > >
> > > As per Kafka documentation,
> > >
> > > *Automatic Offset Committing,*
> > > "Setting enable.auto.commit means that offsets are committed
> > > automatically."
> > >
> > > Can you help me understand where the offsets are stored in case of
> > > automatic offset committing.. I believe it used to be on Zookeeper but
> > not
> > > anymore as per new KafkaConsumer API.
> > >
> > >
> > > Thanks
> > >
> >
>


Re: Does offsetsForTimes use createtime of logsegment file?

2017-01-23 Thread Ewen Cheslack-Postava
The broker still accepts that version and the Scala API still includes
support for that timestamp. Note that the way this worked in previous
versions was by looking only at the timestamp for each log segment instead
of using a timestamp index within the segment.

Note that the new consumer now includes an offsetsForTimes method. Using
that would be the recommended path forward, especially if you are writing
new code and are working with recent enough versions of Kafka.

-Ewen

On Thu, Jan 19, 2017 at 10:02 AM, Vignesh <vignesh.v...@gmail.com> wrote:

> Another question, with getOffsetsBefore, we used to be able to get offsets
> for time in older versions.
> .10 doesn't have an equivalent method.
>
> Is there any other way to achieve the same functionality as
> getOffsetsBefore
> in .10 ? Does a .10 server respond to ListOffsetRequestV0 request?
>
>
> On Fri, Jan 6, 2017 at 1:26 PM, Ewen Cheslack-Postava <e...@confluent.io>
> wrote:
>
> > It would return the earlier one, offset 0.
> >
> > -Ewen
> >
> > On Thu, Jan 5, 2017 at 10:15 PM, Vignesh <vignesh.v...@gmail.com> wrote:
> >
> > > Thanks. I didn't realize ListOffsetRequestV1 is only available 0.10.1
> > > (which has KIP-33, time index).
> > > When timestamp is set by user (CreationTime), and it is not always
> > > increasing, would this method still return the offset of first message
> > with
> > > timestamp greater than equal to the provided timestamp?
> > >
> > >
> > > For example, in below scenario
> > >
> > > Message1, Timestamp = T1, Offset = 0
> > > Message2, Timestamp = T0 (or T2), Offset = 1
> > > Message3, Timestamp = T1, Offset = 2
> > >
> > >
> > > Would offsetForTimestamp(T1) return offset for earliest message with
> > > timestamp T1 (i.e. Offset 0 in above example) ?
> > >
> > >
> > > -Vignesh.
> > >
> > > On Thu, Jan 5, 2017 at 8:19 PM, Ewen Cheslack-Postava <
> e...@confluent.io
> > >
> > > wrote:
> > >
> > > > On Wed, Jan 4, 2017 at 11:54 PM, Vignesh <vignesh.v...@gmail.com>
> > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > offsetsForTimes
> > > > > <https://kafka.apache.org/0101/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#offsetsForTimes(java.util.Map)>
> > > > > function
> > > > > returns offset for a given timestamp. Does it use message's
> timestamp
> > > > > (which could be LogAppendTime or set by user) or creation time of
> > > > > logsegment file?
> > > > >
> > > > >
> > > > This is actually tied to how the ListOffsetsRequest is handled. But
> if
> > > > you're on a recent version, then the KIP
> > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65868090
> > > > made it use the more accurate version based on message timestamps.
> > > >
> > > >
> > > > >
> > > > > KIP-33
> > > > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-33+-+Add+a+time+based+log+index>
> > > > > adds timestamp based index, and it is available only from 0.10.1 .
> > Does
> > > > >  above function work on 0.10.0 ? If so, are there any differences
> in
> > > how
> > > > it
> > > > > works between versions 0.10.0 and 0.10.1 ?
> > > > >
> > > > >
> > > > The KIP was only adopted and implemented in 0.10.1+. It is not
> > available
> > > in
> > > > 0.10.0.
> > > >
> > > >
> > > > > Thanks,
> > > > > Vignesh.
> > > > >
> > > >
> > >
> >
>
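The lookup semantics discussed in this thread, namely returning the earliest offset whose timestamp is greater than or equal to the target even when timestamps are not monotonically increasing, can be modeled roughly as a linear scan (the broker uses a time index; this toy version captures only the semantics):

```java
public class OffsetsForTimesSketch {
    // timestamps: the timestamp of each message, indexed by offset.
    // Returns the smallest offset whose timestamp is >= target, or -1.
    static long offsetForTimestamp(long[] timestamps, long target) {
        for (int offset = 0; offset < timestamps.length; offset++) {
            if (timestamps[offset] >= target) return offset;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Message1 ts=T1(10), Message2 ts=T0(5), Message3 ts=T1(10),
        // mirroring the example in the thread above.
        long[] ts = {10, 5, 10};
        System.out.println(offsetForTimestamp(ts, 10)); // 0 -- the earlier one
        System.out.println(offsetForTimestamp(ts, 11)); // -1 -- nothing that late
    }
}
```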


Re: Kafka Protocol : about clients and number of TCP connections

2017-01-23 Thread Ewen Cheslack-Postava
The only other connections to brokers would be to the bootstrap brokers in
order to collect cluster metadata.

-Ewen

On Wed, Jan 18, 2017 at 3:48 AM, Paolo Patierno  wrote:

> Hello,
>
>
> I'd like to know the number of connections that Kafka clients establish. I
> mean ...
>
>
> The producer establishes a TCP connection for each "leader" broker which
> has the partition where the producer itself wants to send messages.
>
> Is there any other connection to consider ?
>
>
> The consumer establishes a TCP connection for each "leader" broker which
> has the assigned partition (or requested by client itself) for reading.
>
> Other than that I understood that consumer has a TCP connection even with
> the coordinator for polling information about group changes and partitions
> will be re-balanced.
>
> Is there any other connection to consider ?
>
>
> Thanks,
>
> Paolo
>
>
> Paolo Patierno
> Senior Software Engineer (IoT) @ Red Hat
> Microsoft MVP on Windows Embedded & IoT
> Microsoft Azure Advisor
>
> Twitter : @ppatierno
> Linkedin : paolopatierno
> Blog : DevExperience
>


Re: min hardware requirement

2017-01-23 Thread Ewen Cheslack-Postava
Smaller servers/instances work fine for tests, as long as the workload is
scaled down as well. Most memory on a Kafka broker will end up dedicated to
page cache. For, e.g., 1GB of RAM just consider that you probably won't be
leaving much room to cache the data so your performance may suffer a bit.

-Ewen

On Fri, Jan 20, 2017 at 7:28 PM, Laxmi Narayan NIT DGP  wrote:

> Hi ,
> what is min hardware requirement for kafka ?
>
> I see min ram size for production is recommended is 32GB.
>
> what can be issue with 8 GB ram and for test purpose i was planning to use
>
> some 1gb or 4gb aws machine, is it safe to run in 1gb machine for few days?
>
> For log write i have big disk.
>
>
> *Regards,*
> *Laxmi Narayan Patel*
> *MCA NIT Durgapur (2011-2014)*
> *Mob:-9741292048,8345847473*
>


Re: Automatic Offset Committing

2017-01-23 Thread Ewen Cheslack-Postava
The new consumer only supports committing offsets to Kafka. (It doesn't
even have connection info to ZooKeeper, which is a general trend in Kafka
clients -- all details of ZooKeeper are being hidden away from clients,
even administrative functions like creating topics.)

-Ewen

On Thu, Jan 19, 2017 at 7:00 PM, vinay ng  wrote:

> Hi,
>
> As per Kafka documentation,
>
> *Automatic Offset Committing,*
> "Setting enable.auto.commit means that offsets are committed
> automatically."
>
> Can you help me understand where the offsets are stored in case of
> automatic offset committing.. I believe it used to be on Zookeeper but not
> anymore as per new KafkaConsumer API.
>
>
> Thanks
>


Re: Query on MirrorMaker Replication - Bi-directional/Failover replication

2017-01-23 Thread Ewen Cheslack-Postava
On Wed, Jan 18, 2017 at 4:56 PM, Greenhorn Techie <greenhorntec...@gmail.com
> wrote:

> Hi there,
>
> Can anyone please answer to my below follow-up questions for Ewen's
> responses.
>
> Thanks
>
>
> On Tue, 17 Jan 2017 at 00:28 Greenhorn Techie <greenhorntec...@gmail.com>
> wrote:
>
>> Thanks Ewen for the detailed response. This is quite helpful and cleared
>> some of my doubts. However, I do have some follow-up queries. Can you
>> please let me know your thoughts on the same?
>>
>> [Query] Is non-compacted topics a pre-requisite to have this mechanism
>> work as expected? What are the challenges that need to be looked for in
>> case of compacted-topics?
>>
>
There are additional considerations when you are using compacted topics.
The core of the problem is that offsets will differ between either side.
That can be true of non-compacted topics as well, but at least they have a
consistent offset you can rely on (up to lagging too far behind and losing
data). In a compacted topic, the difference between offsets in source and
sink could change frequently depending on a number of factors including
lag, segment size, etc.


> 1. Use MM normally to replicate your data. Be *very* sure you construct
>> your setup to ensure *everything* is mirrored (proper # of partitions,
>> replication factor, topic level configs, etc). (Note that this is something
>> the Confluent replication solution is addressing that's a significant
>> gap in MM.)
>>
> [Query] Here we are planning to use MirrorMaker to do the job for us and
>> hence topic creation etc is expected to be created by MirrorMaker (by
>> setting auto.create.topics.enable=true) Will this work? or will setting
>> auto.create.topics.enable=true create the topic with default settings?
>>
>
MirrorMaker doesn't do this today. This is one of the features Confluent's
Replicator provides, including mirroring all topic level configs. If you
use MM you'll need to find a way to handle that yourself, either by not
allowing MM to replicate data on a topic until you know the destination
> topic has been created or by using one of MM's pluggable components to do so.


> 2. During replication, be sure to record offset deltas for every
>> topic partition. These are needed to reverse the direction of
>> replication correctly. Make sure to store them in the backup DC and
>> somewhere very reliable.
>> [Query] Is there any recommended approach to do this? As I am new to
>> Kafka, wondering if there is a good way of doing this
>>
>
No recommended way. If replication is paused you can just record the high
watermark on both sides.
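As a sketch of what recording those high watermarks buys you (all numbers here are hypothetical): the delta between the two watermarks, captured while replication is paused, is what lets you translate offsets when the direction of replication is reversed:

```java
public class OffsetDelta {
    // delta = source high watermark - sink high watermark,
    // captured for one topic partition while replication is paused.
    static long delta(long sourceHw, long sinkHw) {
        return sourceHw - sinkHw;
    }

    // Translate an offset in the backup DC back to the original DC's numbering.
    static long translate(long backupOffset, long delta) {
        return backupOffset + delta;
    }

    public static void main(String[] args) {
        long d = delta(1_000_500L, 1_000_000L); // source was 500 ahead at pause time
        System.out.println(d);                   // 500
        System.out.println(translate(42L, d));   // 542
    }
}
```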


> 4. Decide to do failover. Ensure replication has actually stopped (via
>> your own tooling, or probably better, by using ACLs to ensure no new data
>> can be
>> produced from original DC to backup DC)
>> [Query] Does stopping replication mean killing the MirrorMaker process?
>> Or is there more needed here? Using ACLs probably, we can ensure the mirror
>> maker service account doesn't have read access on the source cluster and
>> write access on the DR cluster. Is there anything else to be done here?
>>
>
At its core, yes, it means stopping MM. But as you say, you can get fancier
and enforce this with, eg. ACLs.


> 5. Record all the high watermarks for every topic partition so you
>> know which data was replicated from the original DC (vs which is new
>> after failover).
>> [Query] Is there any best practice around this? In the presentation Jun
>> Rao talks about time-stamp based offset recording. As I understand, that
>> would probably help our case, where we can probably produce messages to the
>> DR cluster, from the point of failover
>>
>
Note that the timestamp based offset doesn't give you the perfect reversal
of mirroring that you are looking for. The granularity of timestamps can't
possibly guarantee that (let alone timestamp ordering issues).

Timestamp-based reset in the case of a failover is a good option today
assuming you are on 0.10.1+ brokers.


> 7. Once the original DC is back alive, you want to reverse replication
>> and make it the backup. Lookup the offset deltas, use them to
>> initialize offsets for the consumer group you'll use to do replication.
>> [Query]In order to lookup the offset deltas before initiating the
>> consumers on the original cluster, is there any recommended
>> mechanism/tooling that can be leveraged?
>>
>
There isn't tooling for this, and the intent in this step is to leverage
the deltas you recorded in an earlier step. You'd probably want to write
one tool that handles both of those steps since the output of one step is
the input to the other.

-Ewen


>
>> Best Reg

Re: Kafka Connect: Paused connector but still processing data

2017-01-23 Thread Ewen Cheslack-Postava
There was this issue: https://issues.apache.org/jira/browse/KAFKA-4527
which was a test failure that had to do with updating the status as soon as
the request to pause the connector was received rather than after it was
processed. The corresponding PR fixed that (and will be released in
0.10.2.0).

If your connector is not properly responding to requests to stop, it's
possible you would see the connector continue to process data incorrectly.
If it is receiving the request to stop processing data (source connectors
will be stopped, sink connectors will simply pause consumption), then it
sounds like you found a way to trigger a bug. If you have a set of steps to
reproduce the issue, a bug report would be appreciated.
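As a rough model of that contract (this is an illustration, not the actual Connect framework classes), a task that honors pause stops returning records once the flag is set:

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Minimal model of pause behavior: the poll loop checks a pause flag before
// producing data. Returning no records while paused is what keeps a paused
// connector from continuing to process data.
public class PausableTask {
    private final AtomicBoolean paused = new AtomicBoolean(false);

    void pause() { paused.set(true); }
    void resume() { paused.set(false); }

    List<String> poll() {
        if (paused.get())
            return Collections.emptyList(); // honor the pause request
        return Collections.singletonList("record");
    }

    public static void main(String[] args) {
        PausableTask task = new PausableTask();
        System.out.println(task.poll().size()); // 1
        task.pause();
        System.out.println(task.poll().size()); // 0
    }
}
```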

-Ewen

On Mon, Jan 16, 2017 at 11:01 PM, Stephane Maarek <
steph...@simplemachines.com.au> wrote:

> Hi,
>
> I have paused my connector yet it’s still very much active and processing
> data. I can see because the offset lag keeps on decreasing (it got 100M
> messages to read yet).
> Is such a bug known? When I get the status it says PAUSED
>
> (It is a custom connector, but I don’t think I’ve implemented anything in
> an odd way)
>
> Thanks!
> Stephane
>


Re: Kafka Connect requestTaskReconfiguration

2017-01-15 Thread Ewen Cheslack-Postava
This is currently expected. Internally the Connect cluster uses the same
rebalancing process as consumer groups which means it has similar
limitations -- all tasks must stop just as you would need to stop consuming
from all partitions and commit offsets during a consumer group rebalance.

There's an issue filed about this:
https://issues.apache.org/jira/browse/KAFKA-3351 and it's something we're
aware will eventually become a scalability limitation. There are some ideas
about how to avoid this, but nothing concrete on the roadmap yet.

-Ewen

On Fri, Jan 13, 2017 at 10:32 AM, Mathieu Fenniak <
mathieu.fenn...@replicon.com> wrote:

> Hey kafka-users,
>
> Is it normal for a Kafka Connect source connector that
> calls requestTaskReconfiguration to cause all the connectors on the
> kafka-connect distributed system to be stopped and started?
>
> One of my three connectors (2x source, 1x sink) runs a background thread
> that will occasionally invoke `context.requestTaskReconfiguration()` when
> it detects new work that it wants to distribute to its tasks.  When this
> occurs, I observe all three of the connectors stop and start.  One of the
> other connectors doesn't have the smoothest stop/start cycle, so I'm hoping
> that this might be avoidable?
>
> I'm running on Kafka Connect 0.10.1.0.
>
> Mathieu
>


Re: java.lang.OutOfMemoryError: Java heap space while running kafka-consumer-perf-test.sh

2017-01-13 Thread Ewen Cheslack-Postava
Perhaps the default heap options aren't sufficient for your particular use
of the tool. The consumer perf test defaults to 512MB. I'd simply try
increasing the max heap usage: KAFKA_HEAP_OPTS="-Xmx1024M" to bump it up a
bit.

-Ewen

On Wed, Jan 11, 2017 at 2:59 PM, Check Peck  wrote:

> I am trying to run kafka performance script on my linux box. Whenever I run
> "kafka-consumer-perf-test.sh", I always get an error. In the same box, I am
> running "kafka-producer-perf-test.sh" as well and that is not failing at
> all. Looks like something is wrong with "kafka-consumer-perf-test.sh".
>
> I am running Kafka version 0.10.1.0.
>
> Command I ran:
> ./bin/kafka-consumer-perf-test.sh --zookeeper 110.27.14.10:2181 --messages
> 50 --topic test-topic --threads 1
>
> Error I got:
> [2017-01-11 22:34:09,785] WARN [ConsumerFetcherThread-perf-consumer-14195_kafka-cluster-3098529006-zeidk-1484174043509-46a51434-2-0], Error in fetch kafka.consumer.ConsumerFetcherThread$FetchRequest@54fb48b6 (kafka.consumer.ConsumerFetcherThread)
> java.lang.OutOfMemoryError: Java heap space
>         at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
>         at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
>         at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:93)
>         at kafka.network.BlockingChannel.readCompletely(BlockingChannel.scala:129)
>         at kafka.network.BlockingChannel.receive(BlockingChannel.scala:120)
>         at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:99)
>         at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:83)
>         at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:132)
>         at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:132)
>         at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:132)
>         at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
>         at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:131)
>         at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:131)
>         at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:131)
>         at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
>         at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:130)
>         at kafka.consumer.ConsumerFetcherThread.fetch(ConsumerFetcherThread.scala:109)
>         at kafka.consumer.ConsumerFetcherThread.fetch(ConsumerFetcherThread.scala:29)
>         at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:118)
>         at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:103)
>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
>


Re: Taking a long time to roll a new log segment (~1 min)

2017-01-13 Thread Ewen Cheslack-Postava
>>> previous
> > >>> > email, I see GC running roughly every 10-12 seconds, with total
> > >>> > times similar to the following:
> > >>> >
> > >>> > 2017-01-12T07:16:46.867-0500: 46891.844: Total time for which
> > >>> application
> > >>> > > threads were stopped: 0.0141281 seconds, Stopping threads took:
> > >>> 0.0002171
> > >>> > > seconds
> > >>> > >
> > >>> > >
> > >>> > Here is a link to a GCEasy report:
> > >>> >
> > >>> > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTcvMDEvMTIv
> > >>> LS10b3RhbEdDLWthZmthMS00LmxvZy5nei0tMTUtMzQtNTk=
> > >>> >
> > >>> >
> > >>> > Currently using G1 gc with the following settings:
> > >>> >
> > >>> > -Xmx12G -Xms12G -server -XX:MaxPermSize=48M -verbose:gc
> > >>> > -Xloggc:/var/log/kafka/gc.log -XX:+PrintGCDateStamps
> > >>> -XX:+PrintGCDetails
> > >>> > -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime
> > >>> > -XX:+PrintTLAB -XX:+DisableExplicitGC -XX:+UseGCLogFileRotation
> > >>> > -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M
> > >>> > -XX:+UseCompressedOops -XX:+AlwaysPreTouch -XX:+UseG1GC
> > >>> > -XX:MaxGCPauseMillis=20 -XX:+HeapDumpOnOutOfMemoryError
> > >>> > -XX:HeapDumpPath=/var/log/kafka/heapDump.log
> > >>> > -Xloggc:/opt/kafka/current/bin/../logs/kafkaServer-gc.log
> > >>> > -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps
> > >>> > -XX:+PrintGCTimeStamps
> > >>> >
> > >>> >
> > >>> >
> > >>> > On Thu, Jan 12, 2017 at 9:55 AM, Tauzell, Dave <
> > >>> > dave.tauz...@surescripts.com
> > >>> > > wrote:
> > >>> >
> > >>> > > Can you collect garbage collection stats and verify there isn't
> > >>> > > a
> > >>> long GC
> > >>> > > happening at the same time?
> > >>> > >
> > >>> > > -Dave
> > >>> > >
> > >>> > > -Original Message-
> > >>> > > From: Stephen Powis [mailto:spo...@salesforce.com]
> > >>> > > Sent: Thursday, January 12, 2017 8:34 AM
> > >>> > > To: users@kafka.apache.org
> > >>> > > Subject: Re: Taking a long time to roll a new log segment (~1
> > >>> > > min)
> > >>> > >
> > >>> > > So per the kafka docs I up'd our FD limit to 100k, and we are no
> > >>> longer
> > >>> > > seeing the process die, which is good.
> > >>> > >
> > >>> > > Unfortunately we're still seeing very high log segment roll
> > >>> > > times,
> > >>> and
> > >>> > I'm
> > >>> > > unsure if this is considered 'normal', as it tends to block
> > >>> > > producers during this period.
> > >>> > >
> > >>> > > We are running kafka 0.10.0.1, but I patched in some
> > >>> > > additionally
> > >>> timing
> > >>> > > statements into the kafka.log.log roll() method to narrow down
> > >>> exactly
> > >>> > > which part of that method is taking so long.
> > >>> > >
> > >>> > > Again, typically the process to roll a new log file takes only
> > >>> > > 1-2ms
> > >>> > tops,
> > >>> > > but several times a day it takes 30-60+ seconds, across all of
> > >>> > > our brokers. I've narrowed it down to this bit of code causing
> > >>> > > the
> > >>> issue:
> > >>> > > https://github.com/apache/kafka/blob/0.10.0/core/src/
> > >>> > > main/scala/kafka/log/Log.scala#L652-L658
> > >>> > >
> > >>> > > Here's an example of output w/ my additional timing log
> statements:
> > >>> > >
> > >>> > > [2017-01-12 07:17:58,199] INFO Rolled new log segment for
> > >>> 'MyTopic-4' in
> > >>> > > > 28028 ms. (kafka.log.Log)
> > >>> > >
> > >>> &g

Re: Kafka as a data ingest

2017-01-10 Thread Ewen Cheslack-Postava
Will,

The HDFS connector we ship today is for Kafka -> HDFS, so it isn't
reading/processing data in HDFS.

I was discussing both directions because the question was unclear. However,
there's no reason you couldn't create a connector that processes files in
splits to parallelize an HDFS -> Kafka path, even if it was only for a
single file.
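As a sketch of that split idea (illustrative only; names are made up, and a real connector would also need to align splits to record boundaries), dividing one file into byte ranges for parallel tasks could look like:

```java
import java.util.ArrayList;
import java.util.List;

// Divide one large file into [start, end) byte ranges so that each Connect
// task can read a different range in parallel.
public class FileSplits {
    static List<long[]> splits(long fileSize, int numTasks) {
        List<long[]> ranges = new ArrayList<>();
        long chunk = (fileSize + numTasks - 1) / numTasks; // ceiling division
        for (long start = 0; start < fileSize; start += chunk)
            ranges.add(new long[] { start, Math.min(start + chunk, fileSize) });
        return ranges;
    }

    public static void main(String[] args) {
        for (long[] r : splits(100, 3))
            System.out.println(r[0] + ".." + r[1]); // 0..34, 34..68, 68..100
    }
}
```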

-Ewen

On Tue, Jan 10, 2017 at 5:09 AM, Will Du <will...@gmail.com> wrote:

> In terms of big files, which are quite common in HDFS, does a Connect task
> process the same file in parallel the way MR deals with file splits? I do not
> think so. In this case, a Kafka Connect implementation has no advantage for
> reading a single big file unless you also use MapReduce.
>
> Sent from my iPhone
>
> On Jan 10, 2017, at 02:41, Ewen Cheslack-Postava <e...@confluent.io>
> wrote:
>
> >> However, I'm trying to figure out if I can use Kafka to read Hadoop
> file.
> >
> > The question is a bit unclear as to whether you mean "use Kafka to send
> > data to a Hadoop file" or "use Kafka to read a Hadoop file into a Kafka
> > topic". But in both cases, Kafka Connect provides a good option.
> >
> > The more common use case is sending data that you have in Kafka into
> HDFS.
> > In that case,
> > http://docs.confluent.io/3.1.1/connect/connect-hdfs/docs/
> hdfs_connector.html
> > is a good option.
> >
> > If you want the less common case of sending data from HDFS files into a
> > stream of Kafka records, I'm not aware of a connector for doing that yet
> > but it is definitely possible. Kafka Connect takes care of a lot of the
> > details for you so all you have to do is read the file and emit Connect's
> > SourceRecords containing the data from the file. Most other details are
> > handled for you.
> >
> > -Ewen
> >
> >> On Mon, Jan 9, 2017 at 9:18 PM, Sharninder <sharnin...@gmail.com>
> wrote:
> >>
> >> If you want to know if "kafka" can read hadoop files, then no. But you
> can
> >> write your own producer that reads from hdfs any which way and pushes to
> >> kafka. We use kafka as the ingestion pipeline's main queue. Read from
> >> various sources and push everything to kafka.
> >>
> >>
> >> On Tue, Jan 10, 2017 at 6:26 AM, Cas Apanowicz <
> >> cas.apanow...@it-horizon.com
> >>> wrote:
> >>
> >>> Hi,
> >>>
> >>> I have general understanding of main Kafka functionality as a streaming
> >>> tool.
> >>> However, I'm trying to figure out if I can use Kafka to read Hadoop
> file.
> >>> Can you please advise?
> >>> Thanks
> >>>
> >>> Cas
> >>>
> >>>
> >>
> >>
> >> --
> >> --
> >> Sharninder
> >>
>


Re: Json to JDBC using Kafka JDBC connector Sink

2017-01-10 Thread Ewen Cheslack-Postava
Anything with a table structure is probably not going to handle schemaless
data (i.e. JSON) very well without some extra help -- tables usually expect
schemas and JSON doesn't have a schema. As it stands today, the JDBC sink
connector will probably not handle your use case.

To send schemaless data into a schema-based system, you'd probably need to
impose/extract a schema. An upcoming feature called Single Message Transforms
(https://cwiki.apache.org/confluence/display/KAFKA/KIP-66%3A+Single+Message+Transforms+for+Kafka+Connect)
could potentially help do this (in a generic way that doesn't depend on the
Connector being used).

The only alternative would be to update the JDBC sink to handle JSON data
directly. Some databases might handle this if the entire record were
converted to a JSON-type field (i.e. a single-column table), but I'm
guessing you are looking for output that's a bit more structured than that.
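On the idempotence part of the question, which doesn't depend on the connector: topic+partition+offset does uniquely identify a record, so a key built from it makes upserts safe to retry. A trivial sketch (the separator and format are arbitrary choices):

```java
// Build a unique, idempotent primary key for a Kafka record from its
// coordinates: the same record always produces the same id, so re-delivery
// on retry results in an upsert of the same row rather than a duplicate.
public class RecordId {
    static String recordId(String topic, int partition, long offset) {
        return topic + "+" + partition + "+" + offset;
    }

    public static void main(String[] args) {
        System.out.println(recordId("docs", 2, 1234)); // docs+2+1234
    }
}
```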

-Ewen

On Mon, Jan 9, 2017 at 4:14 PM, Stephane Maarek <
steph...@simplemachines.com.au> wrote:

> Hi,
>
> I’m wondering if the following is feasible…
> I have a json document with pretty much 0 schema. The only thing I know for
> sure is that it’s a json document.
> My goal is to pipe that json document in a postgres table that has two
> columns: id and json. The id column is basically topic+partition+offset (to
> guarantee idempotence on upserts), and the json column is basically the
> json document
>
> Is that feasible using the out of the box JDBC connector? I didn’t see any
> support for “json type” fields
>
> Thanks,
> Stephane
>


Re: Kafka as a data ingest

2017-01-09 Thread Ewen Cheslack-Postava
> However, I'm trying to figure out if I can use Kafka to read Hadoop file.

The question is a bit unclear as to whether you mean "use Kafka to send
data to a Hadoop file" or "use Kafka to read a Hadoop file into a Kafka
topic". But in both cases, Kafka Connect provides a good option.

The more common use case is sending data that you have in Kafka into HDFS.
In that case,
http://docs.confluent.io/3.1.1/connect/connect-hdfs/docs/hdfs_connector.html
is a good option.

If you want the less common case of sending data from HDFS files into a
stream of Kafka records, I'm not aware of a connector for doing that yet
but it is definitely possible. Kafka Connect takes care of a lot of the
details for you so all you have to do is read the file and emit Connect's
SourceRecords containing the data from the file. Most other details are
handled for you.
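A rough sketch of that shape, with plain strings standing in for Connect's SourceRecord (source partition = file name, source offset = line number; the real classes live in org.apache.kafka.connect.source, and the file name here is hypothetical):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Model of an HDFS -> Kafka source task: read a file's lines and emit one
// record per line, carrying enough position info (file + line number) for
// the framework to resume from the right place after a restart.
public class FileToRecords {
    static List<String> toRecords(String fileName, List<String> lines) {
        List<String> records = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++)
            // a real SourceRecord would carry {file=fileName}, {line=i}, and the value
            records.add(fileName + ":" + i + ":" + lines.get(i));
        return records;
    }

    public static void main(String[] args) {
        List<String> recs = toRecords("/data/input.txt", Arrays.asList("a", "b"));
        System.out.println(recs.get(1)); // /data/input.txt:1:b
    }
}
```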

-Ewen

On Mon, Jan 9, 2017 at 9:18 PM, Sharninder  wrote:

> If you want to know if "kafka" can read hadoop files, then no. But you can
> write your own producer that reads from hdfs any which way and pushes to
> kafka. We use kafka as the ingestion pipeline's main queue. Read from
> various sources and push everything to kafka.
>
>
> On Tue, Jan 10, 2017 at 6:26 AM, Cas Apanowicz <
> cas.apanow...@it-horizon.com
> > wrote:
>
> > Hi,
> >
> > I have general understanding of main Kafka functionality as a streaming
> > tool.
> > However, I'm trying to figure out if I can use Kafka to read Hadoop file.
> > Can you please advise?
> > Thanks
> >
> > Cas
> >
> >
>
>
> --
> --
> Sharninder
>


Re: Taking a long time to roll a new log segment (~1 min)

2017-01-09 Thread Ewen Cheslack-Postava
I can't speak to the exact details of why fds would be kept open longer in
that specific case, but are you aware that the recommendation for
production clusters for open fd limits is much higher? It's been suggested
to be 100,000 as a starting point for quite a while:
http://kafka.apache.org/documentation.html#os

-Ewen

On Mon, Jan 9, 2017 at 12:45 PM, Stephen Powis 
wrote:

> Hey!
>
> I've run into something concerning in our production clusterI believe
> I've posted this question to the mailing list previously (
> http://mail-archives.apache.org/mod_mbox/kafka-users/201609.mbox/browser)
> but the problem has become considerably more serious.
>
> We've been fighting issues where Kafka 0.10.0.1 hits its max file
> descriptor limit.  Our limit is set to ~16k, and under normal operation it
> holds steady around 4k open files.
>
> But occasionally Kafka will roll a new log segment, which typically takes
> on the order of magnitude of a few milliseconds.  However...sometimes it
> will take a considerable amount of time, any where from 40 seconds up to
> over a minute.  When this happens, it seems like connections are not
> released by kafka, and we end up with thousands of client connections stuck
> in CLOSE_WAIT, which pile up and exceed our max file descriptor limit.
> This happens all in the span of about a minute.
>
> Our logs look like this:
>
> [2017-01-08 01:10:17,117] INFO Rolled new log segment for 'MyTopic-8' in
> > 41122 ms. (kafka.log.Log)
> > [2017-01-08 01:10:32,550] INFO Rolled new log segment for 'MyTopic-4' in
> 1
> > ms. (kafka.log.Log)
> > [2017-01-08 01:11:10,039] INFO [Group Metadata Manager on Broker 4]:
> > Removed 0 expired offsets in 0 milliseconds.
> > (kafka.coordinator.GroupMetadataManager)
> > [2017-01-08 01:19:02,877] ERROR Error while accepting connection
> > (kafka.network.Acceptor)
> > java.io.IOException: Too many open files
> >         at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
> >         at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
> >         at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
> >         at kafka.network.Acceptor.accept(SocketServer.scala:323)
> >         at kafka.network.Acceptor.run(SocketServer.scala:268)
> >         at java.lang.Thread.run(Thread.java:745)
> > [2017-01-08 01:19:02,877] ERROR Error while accepting connection
> > (kafka.network.Acceptor)
> > java.io.IOException: Too many open files
> >         at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
> >         at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
> >         at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
> >         at kafka.network.Acceptor.accept(SocketServer.scala:323)
> >         at kafka.network.Acceptor.run(SocketServer.scala:268)
> >         at java.lang.Thread.run(Thread.java:745)
> > .
> >
>
>
> And then kafka crashes.
>
> Has anyone seen this behavior of slow log segmented being rolled?  Any
> ideas of how to track down what could be causing this?
>
> Thanks!
> Stephen
>


Re: compaction + delete not working for me

2017-01-06 Thread Ewen Cheslack-Postava
On Fri, Jan 6, 2017 at 3:57 AM, Mike Gould  wrote:

> Hi
>
> I'm trying to configure log compaction + deletion as per KIP-71 in kafka
> 0.10.1 but so far haven't had any luck. My tests show more than 50%
> duplicate keys when reading from the beginning even several minutes after
> all the events were sent.
> The documentation in section 3.1 doesn't seem very clear to me in terms of
> exactly how to configure particular behavior. Could someone please clarify
> a few things for me?
>
> In order to significantly reduce the amount of data that new subscribers
> have to receive I want to compact events as soon as possible, and delete
> any events more than 24 hours old (e.g if there hasn't been an update with
> a matching key for 24h).
>
> I have set
>
> cleanup.policy=compact, delete
> min.cleanable.dirty.ratio=0.5
> min.compaction.lag.ms=0
> retention.ms=8640
> delete.retention.ms=8646
> segment.ms=6
>
>
>- Should the cleanup.policy be "compact,delete" or "compact, delete" or
>something else?
>

Either should work, extra leading and trailing spaces are removed.
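A simplified model of that list-config parsing (not the actual broker code) shows why the extra space is harmless:

```java
import java.util.ArrayList;
import java.util.List;

// List-valued configs like cleanup.policy are split on commas and each
// element is trimmed, so "compact,delete" and "compact, delete" parse the same.
public class CleanupPolicyParse {
    static List<String> parse(String value) {
        List<String> policies = new ArrayList<>();
        for (String p : value.split(","))
            policies.add(p.trim());
        return policies;
    }

    public static void main(String[] args) {
        System.out.println(parse("compact,delete"));  // [compact, delete]
        System.out.println(parse("compact, delete")); // [compact, delete]
    }
}
```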


>- Are events eligible for compaction soon after the
> min.compaction.lag.ms
>time and segment.ms or is there another parameter that affects this?

>    I.e. if I read from the beginning after a couple of minutes should I see no
>    more than 50% of the events received have the same key as previous events.
>

Maybe you need to modify log.retention.check.interval.ms? It defaults to 5
minutes. The log cleaning runs periodically, so you may just not have
waited long enough for cleaning to have executed.
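Separately, the min.cleanable.dirty.ratio=0.5 setting itself can explain seeing up to 50% duplicates: the cleaner skips a log until the uncompacted ("dirty") portion is a large enough fraction of it. A toy model of that check (not the actual LogCleaner code):

```java
// Simplified model of the min.cleanable.dirty.ratio check: a log is only
// considered for compaction once dirty bytes / total bytes reaches the ratio.
public class DirtyRatioModel {
    static boolean cleanable(long cleanBytes, long dirtyBytes, double minDirtyRatio) {
        long total = cleanBytes + dirtyBytes;
        if (total == 0) return false;
        return (double) dirtyBytes / total >= minDirtyRatio;
    }

    public static void main(String[] args) {
        // With ratio 0.5, a log that is only 40% dirty is left alone, which
        // can look like "compaction isn't working":
        System.out.println(cleanable(60, 40, 0.5)); // false
        System.out.println(cleanable(40, 60, 0.5)); // true
    }
}
```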


>- Does the retention.ms parameter only affect the deletion?
>- How can I tell if the config is accepted and compaction is working? Is
>there something useful to search for in the logs?
>

Check for logs from LogCleaner.scala. It should log some info when it runs.


>- Also if I change the topic config via the kafka-configs.sh tool does
>the change take effect immediately for existing events, do I have to
>restart the brokers, or does it only affect new events?
>

Topic config changes shouldn't need a broker restart.

-Ewen


>
> Thank you
> Mike G
>


Re: Does offsetsForTimes use createtime of logsegment file?

2017-01-06 Thread Ewen Cheslack-Postava
It would return the earlier one, offset 0.
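A toy model of the lookup semantics, using Vignesh's example (the broker answers this via the time index rather than a linear scan; the scan here is just to show the behavior):

```java
import java.util.Arrays;
import java.util.List;

// Return the offset of the first message whose timestamp is >= the target,
// scanning in offset order.
public class OffsetForTimeModel {
    // timestamps.get(i) = timestamp of the message at offset i
    static long offsetForTime(List<Long> timestamps, long target) {
        for (int offset = 0; offset < timestamps.size(); offset++)
            if (timestamps.get(offset) >= target)
                return offset;
        return -1; // no such message
    }

    public static void main(String[] args) {
        long t0 = 5, t1 = 10;
        // Vignesh's example: offsets 0, 1, 2 with timestamps T1, T0, T1
        List<Long> ts = Arrays.asList(t1, t0, t1);
        System.out.println(offsetForTime(ts, t1)); // 0 -- the earlier one
    }
}
```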

-Ewen

On Thu, Jan 5, 2017 at 10:15 PM, Vignesh <vignesh.v...@gmail.com> wrote:

> Thanks. I didn't realize ListOffsetRequestV1 is only available 0.10.1
> (which has KIP-33, time index).
> When timestamp is set by user (CreationTime), and it is not always
> increasing, would this method still return the offset of first message with
> timestamp greater than equal to the provided timestamp?
>
>
> For example, in below scenario
>
> Message1, Timestamp = T1, Offset = 0
> Message2, Timestamp = T0 (or T2), Offset = 1
> Message3, Timestamp = T1, Offset = 2
>
>
> Would offsetForTimestamp(T1) return offset for earliest message with
> timestamp T1 (i.e. Offset 0 in above example) ?
>
>
> -Vignesh.
>
> On Thu, Jan 5, 2017 at 8:19 PM, Ewen Cheslack-Postava <e...@confluent.io>
> wrote:
>
> > On Wed, Jan 4, 2017 at 11:54 PM, Vignesh <vignesh.v...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > offsetsForTimes
> > > <https://kafka.apache.org/0101/javadoc/org/apache/kafka/
> > clients/consumer/
> > > KafkaConsumer.html#offsetsForTimes(java.util.Map)>
> > > function
> > > returns offset for a given timestamp. Does it use message's timestamp
> > > (which could be LogAppendTime or set by user) or creation time of
> > > logsegment file?
> > >
> > >
> > This is actually tied to how the ListOffsetsRequest is handled. But if
> > you're on a recent version, then the KIP
> > https://cwiki.apache.org/confluence/pages/viewpage.
> action?pageId=65868090
> > made it use the more accurate version based on message timestamps.
> >
> >
> > >
> > > KIP-33
> > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 33+-+Add+a+time+based+log+index>
> > > adds timestamp based index, and it is available only from 0.10.1 . Does
> > >  above function work on 0.10.0 ? If so, are there any differences in
> how
> > it
> > > works between versions 0.10.0 and 0.10.1 ?
> > >
> > >
> > The KIP was only adopted and implemented in 0.10.1+. It is not available
> in
> > 0.10.0.
> >
> >
> > > Thanks,
> > > Vignesh.
> > >
> >
>


Re: One big kafka connect cluster or many small ones?

2017-01-06 Thread Ewen Cheslack-Postava
Yeah, you'd set the key.converter and/or value.converter in your connector
config.
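For example, a connector config with per-connector JSON converter overrides could look like the following (the connector name and class are hypothetical; the converter keys are the ones added by KIP-75):

```java
import java.util.HashMap;
import java.util.Map;

// Build the config map you would POST to the Connect REST API for one connector.
public class ConnectorConfigExample {
    static Map<String, String> jsonOverrides() {
        Map<String, String> config = new HashMap<>();
        config.put("name", "json-sink");                              // hypothetical
        config.put("connector.class", "com.example.MySinkConnector"); // hypothetical
        // Override the worker-level converters just for this connector:
        config.put("key.converter", "org.apache.kafka.connect.json.JsonConverter");
        config.put("value.converter", "org.apache.kafka.connect.json.JsonConverter");
        config.put("value.converter.schemas.enable", "false");
        return config;
    }

    public static void main(String[] args) {
        System.out.println(jsonOverrides().get("value.converter"));
    }
}
```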

-Ewen

On Thu, Jan 5, 2017 at 9:50 PM, Stephane Maarek <
steph...@simplemachines.com.au> wrote:

> Thanks!
> So I just override the conf while doing the API call? It’d be great to
> have this documented somewhere on the confluent website. I couldn’t find
> it.
>
> On 6 January 2017 at 3:42:45 pm, Ewen Cheslack-Postava (e...@confluent.io)
> wrote:
>
> On Thu, Jan 5, 2017 at 7:19 PM, Stephane Maarek <
> steph...@simplemachines.com.au> wrote:
>
>> Thanks a lot for the guidance, I think we’ll go ahead with one cluster. I
>> just need to figure out how our CD pipeline can talk to our connect cluster
>> securely (because it’ll need direct access to perform API calls).
>>
>
> The documentation isn't great here, but you can apply all the normal
> security configs to Connect (in distributed mode, it's basically equivalent
> to a consumer, so everything you can do with a consumer you can do with
> Connect).
>
>
>>
>> Lastly, a question or maybe a piece of feedback… is it not possible to
>> specify the key serializer and deserializer as part of the rest api job
>> config?
>> The issue is that sometimes our data is avro, sometimes it’s json. And it
>> seems I’d need two separate clusters for that?
>>
>
> This is new! As of 0.10.1.0, we have https://cwiki.apache.org/
> confluence/display/KAFKA/KIP-75+-+Add+per-connector+Converters which
> allows you to include it in the connector config. It's called "Converter"
> in Connect because it does a bit more than ser/des if you've written them
> for Kafka, but they are basically just pluggable ser/des. We knew folks
> would want this, it just took us awhile to find the bandwidth to implement
> it. Now, you shouldn't need to do anything special or deploy multiple
> clusters -- it's baked in and supported as long as you are willing to
> override it on a per-connector basis (and this seems reasonable for most
> folks since *ideally* you are *somewhat* standardized on a common
> serialization format).
>
> -Ewen
>
>
>>
>> On 6 January 2017 at 1:54:10 pm, Ewen Cheslack-Postava (e...@confluent.io)
>> wrote:
>>
>> On Thu, Jan 5, 2017 at 3:12 PM, Stephane Maarek <
>> steph...@simplemachines.com.au> wrote:
>>
>> > Hi,
>> >
>> > We like to operate in micro-services (dockerize and ship everything on
>> ecs)
>> > and I was wondering which approach was preferred.
>> > We have one kafka cluster, one zookeeper cluster, etc, but when it
>> comes to
>> > kafka connect I have some doubts.
>> >
>> > Is it better to have one big kafka connect with multiple nodes, or many
>> > small kafka connect clusters or standalone, for each connector / etl ?
>> >
>>
>> You can do any of these, and it may depend on how you do
>> orchestration/deployment.
>>
>> We built Connect to support running one big cluster running a bunch of
>> connectors. It balances work automatically and provides a way to control
>> scale up/down via increased parallelism. This means we don't need to make
>> any assumptions about how you deploy, how you handle elastically scaling
>> your clusters, etc. But if you run in an environment and have the tooling
>> in place to do that already, you can also opt to run many smaller clusters
>> and use that tooling to scale up/down. In that case you'd just make sure
>> there were enough tasks for each connector so that when you scale the # of
>> workers for a cluster up the rebalancing of work would ensure there was
>> enough tasks for every worker to remain occupied.
>>
>> The main drawback of doing this is that Connect uses a few topics for
>> configs, status, and offsets, and you need these to be unique per cluster.
>> This means you'll have 3N more topics. If you're running a *lot* of
>> connectors, that could eventually become a problem. It also means you have
>> that many more worker configs to handle, clusters to monitor, etc. And
>> deploying a connector no longer becomes as simple as just making a call to
>> the service's REST API since there isn't a single centralized service. The
>> main benefits I can think of are a) if you already have preferred tooling
>> for handling elasticity and b) better resource isolation between
>> connectors
>> (i.e. an OOM error in one connector won't affect any other connectors).
>>
>> For standalone mode, we'd generally recommend only using it when
>> distributed mode doesn't make sense, e.g. for log file collection. Other
>> than that, having the fault to

Re: Apache Kafka integration using Apache Camel

2017-01-05 Thread Ewen Cheslack-Postava
More generally, do you have any log errors/messages or additional info?
It's tough to debug issues like this from 3rd party libraries if they don't
provide logs/exception info that indicates why processing a specific
message failed.

-Ewen

On Thu, Jan 5, 2017 at 8:29 PM, UMESH CHAUDHARY  wrote:

> Did you test that kafka console consumer is displaying the produced
> message?
>
> On Fri, Jan 6, 2017 at 9:18 AM, Gupta, Swati  wrote:
>
> > Hello All,
> >
> >
> >
> > I am trying to create a Consumer using Apache Camel for a topic in Apache
> > Kafka.
> > I am using Camel 2.17.0 and Kafka 0.10  and JDK 1.8.
> > I have attached a file, KafkaCamelTestConsumer.java which is a standalone
> > application trying to read from a topic  “test1”created in Apache Kafka
> > I am producing messages from the console and also was successful to
> > produce messages using a Camel program in the topic "test1", but not able
> > to consume messages. Ideally, it should get printed, but nothing seems to
> > happen. The log says that the route has started but does not process any
> > message.
> >
> > Please help to confirm if there is anything wrong with the below syntax:
> >
> > from("kafka:localhost:9092?topic=test1&groupId=testingGroupNew&autoOffsetReset=earliest" +
> >         "=1&keyDeserializer=org.apache.kafka.common.serialization.StringDeserializer&" +
> >         "valueDeserializer=org.apache.kafka.common.serialization.StringDeserializer" +
> >         "=1000=3&autoCommitEnable=true"
> >     ).split()
> >     .body()
> >     .process(new Processor() {
> >         @Override
> >         public void process(Exchange exchange) throws Exception {
> >             String messageKey = "";
> >             if (exchange.getIn() != null) {
> >                 Message message = exchange.getIn();
> >                 Integer partitionId = (Integer) message.getHeader(KafkaConstants.PARTITION);
> >                 String topicName = (String) message.getHeader(KafkaConstants.TOPIC);
> >                 if (message.getHeader(KafkaConstants.KEY) != null)
> >                     messageKey = (String) message.getHeader(KafkaConstants.KEY);
> >                 Object data = message.getBody();
> >
> >                 System.out.println(
> >                     "topicName :: " + topicName +
> >                     " partitionId :: " + partitionId +
> >                     " messageKey :: " + messageKey +
> >                     " message :: " + data + "\n");
> >             }
> >         }
> >     }).to("file://C:/swati/?fileName=MyOutputFile.txt=utf-8");
> > }
> > });
> >
> >
> >
> > I have also tried with the basic parameters as below and it still fails
> to
> > read messages.
> >
> > from("kafka:localhost:9092?topic=test1&groupId=testingGroupNew&autoOffsetReset=earliest")
> >
> > Any help on this will be greatly appreciated.
> >
> > Thanks in advance
> >
> >
> >
> > Thanks & Regards
> >
> > Swati
> >
> >
> >
>


Re: One big kafka connect cluster or many small ones?

2017-01-05 Thread Ewen Cheslack-Postava
On Thu, Jan 5, 2017 at 7:19 PM, Stephane Maarek <
steph...@simplemachines.com.au> wrote:

> Thanks a lot for the guidance, I think we’ll go ahead with one cluster. I
> just need to figure out how our CD pipeline can talk to our connect cluster
> securely (because it’ll need direct access to perform API calls).
>

The documentation isn't great here, but you can apply all the normal
security configs to Connect (in distributed mode, it's basically equivalent
to a consumer, so everything you can do with a consumer you can do with
Connect).


>
> Lastly, a question or maybe a piece of feedback… is it not possible to
> specify the key serializer and deserializer as part of the rest api job
> config?
> The issue is that sometimes our data is avro, sometimes it’s json. And it
> seems I’d need two separate clusters for that?
>

This is new! As of 0.10.1.0, we have
https://cwiki.apache.org/confluence/display/KAFKA/KIP-75+-+Add+per-connector+Converters
which allows you to include it in the connector config. It's called
"Converter" in Connect because it does a bit more than ser/des if you've
written them for Kafka, but they are basically just pluggable ser/des. We
knew folks would want this, it just took us awhile to find the bandwidth to
implement it. Now, you shouldn't need to do anything special or deploy
multiple clusters -- it's baked in and supported as long as you are willing
to override it on a per-connector basis (and this seems reasonable for most
folks since *ideally* you are *somewhat* standardized on a common
serialization format).
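
Concretely, with KIP-75 the override goes straight into the connector config you POST to the REST API. A sketch (the connector name, topic, file path, and schemas.enable choice are made-up examples; the converter classes shown ship with Kafka):

```json
{
  "name": "json-file-sink",
  "config": {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
    "tasks.max": "1",
    "topics": "json-events",
    "file": "/tmp/json-events.txt",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false"
  }
}
```

Other connectors on the same cluster simply keep using the worker's default converters, so Avro and JSON connectors can coexist in one cluster.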

-Ewen


>
> On 6 January 2017 at 1:54:10 pm, Ewen Cheslack-Postava (e...@confluent.io)
> wrote:
>
> On Thu, Jan 5, 2017 at 3:12 PM, Stephane Maarek <
> steph...@simplemachines.com.au> wrote:
>
> > Hi,
> >
> > We like to operate in micro-services (dockerize and ship everything on
> ecs)
> > and I was wondering which approach was preferred.
> > We have one kafka cluster, one zookeeper cluster, etc, but when it comes
> to
> > kafka connect I have some doubts.
> >
> > Is it better to have one big kafka connect with multiple nodes, or many
> > small kafka connect clusters or standalone, for each connector / etl ?
> >
>
> You can do any of these, and it may depend on how you do
> orchestration/deployment.
>
> We built Connect to support running one big cluster running a bunch of
> connectors. It balances work automatically and provides a way to control
> scale up/down via increased parallelism. This means we don't need to make
> any assumptions about how you deploy, how you handle elastically scaling
> your clusters, etc. But if you run in an environment and have the tooling
> in place to do that already, you can also opt to run many smaller clusters
> and use that tooling to scale up/down. In that case you'd just make sure
> there were enough tasks for each connector so that when you scale the # of
> workers for a cluster up, the rebalancing of work would ensure there were
> enough tasks for every worker to remain occupied.
>
> The main drawback of doing this is that Connect uses a few topics for
> configs, status, and offsets and you need these to be unique per cluster.
> This means you'll have 3N more topics. If you're running a *lot* of
> connectors, that could eventually become a problem. It also means you have
> that many more worker configs to handle, clusters to monitor, etc. And
> deploying a connector no longer becomes as simple as just making a call to
> the service's REST API since there isn't a single centralized service. The
> main benefits I can think of are a) if you already have preferred tooling
> for handling elasticity and b) better resource isolation between
> connectors
> (i.e. an OOM error in one connector won't affect any other connectors).
>
> For standalone mode, we'd generally recommend only using it when
> distributed mode doesn't make sense, e.g. for log file collection. Other
> than that, having the fault tolerance and high availability of distributed
> mode is preferred.
>
> On your specific points:
>
> >
> > The issues I’m trying to address are :
> > - Integration with our CI/CD pipeline
> >
>
> I'm not sure anything about Connect affects this. Is there a specific
> concern you have about the CI/CD pipeline & Connect?
>
>
> > - Efficient resources utilisation
> >
>
> Putting all the connectors into one cluster will probably result in better
> resource utilization unless you're already automatically tracking usage
> and
> scaling appropriately. The reason is that if you use a bunch of small
> clusters, you're now stuck trying to optimize N uses. Since Connect can
> already (roughly) balance work, putting all the work into one cluster and
> having connect split it up means you just need to watch utilization of the
> nodes in that one cluster and scale up or down as appropriate.

Re: Consumer Rebalancing Question

2017-01-05 Thread Ewen Cheslack-Postava
Not sure I understand your question about flapping. The LeaveGroupRequest
is only sent on a graceful shutdown. If a consumer knows it is going to
shutdown, it is good to proactively make sure the group knows it needs to
rebalance work because some of the partitions that were handled by the
consumer need to be handled by some other group members.

There's no "flapping" in the sense that the leave group requests should
just inform the other members that they need to take over some of the work.
I would normally think of "flapping" as meaning that things start/stop
unnecessarily. In this case, *someone* needs to deal with the rebalance and
pick up the work being dropped by the worker. There's no flapping because
it's a one-time event -- one worker is shutting down, decides to drop the
work, and a rebalance sorts it out and reassigns it to another member of
the group. This happens once and then the "issue" is resolved without any
additional interruptions.

-Ewen

On Thu, Jan 5, 2017 at 3:01 PM, Pradeep Gollakota <pradeep...@gmail.com>
wrote:

> I see... doesn't that cause flapping though?
>
> On Wed, Jan 4, 2017 at 8:22 PM, Ewen Cheslack-Postava <e...@confluent.io>
> wrote:
>
> > The coordinator will immediately move the group into a rebalance if it
> > needs it. The reason LeaveGroupRequest was added was to avoid having to
> > wait for the session timeout before completing a rebalance. So aside from
> > the latency of cleanup/committing offsets/rejoining after a heartbeat,
> > rolling bounces should be fast for consumer groups.
> >
> > -Ewen
> >
> > On Wed, Jan 4, 2017 at 5:19 PM, Pradeep Gollakota <pradeep...@gmail.com>
> > wrote:
> >
> > > Hi Kafka folks!
> > >
> > > When a consumer is closed, it will issue a LeaveGroupRequest. Does
> anyone
> > > know how long the coordinator waits before reassigning the partitions
> > that
> > > were assigned to the leaving consumer to a new consumer? I ask because
> > I'm
> > > trying to understand the behavior of consumers if you're doing a
> rolling
> > > restart.
> > >
> > > Thanks!
> > > Pradeep
> > >
> >
>


Re: MirrorMaker - Topics Identification and Replication

2017-01-05 Thread Ewen Cheslack-Postava
That all sounds right!

Usually the delay for picking up the metadata update and starting to
replicate the topic won't be an issue since it's a one-time issue (around
topic creation) and the time window is pretty small for that. In steady
state, none of the mentioned delays would apply.
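
That refresh interval is the consumer's metadata.max.age.ms setting (default 300000 ms, i.e. 5 minutes). If the pickup delay for newly created topics matters, you can shorten it in the consumer.properties passed to MirrorMaker, e.g.:

```properties
# Refresh topic metadata every 30s instead of the default 5 minutes so new
# whitelist-matching topics are picked up sooner (at the cost of a bit more
# metadata traffic)
metadata.max.age.ms=30000
```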

-Ewen

On Thu, Jan 5, 2017 at 3:46 AM, Greenhorn Techie <greenhorntec...@gmail.com>
wrote:

> Thanks Ewen for your response.
>
> Just to summarise, here is my understanding. Apologies if something is
> misunderstood. I am new to Kafka and hence still short on knowledge.
>
>
>  - MirrorMaker automatically picks up new topics added on the source
>    cluster, so no restart of the process is needed at regular intervals
>    to update the list of topics
>  - For existing topics, MirrorMaker will replicate messages to the target
>    kafka cluster as and when it sees data in the source kafka cluster's
>    topics
>  - However, for new topics there might be a delay of up to the (default)
>    5 min metadata refresh interval before it starts replicating the data
>    to the target kafka cluster
>
> Please let me know if something is wrong in my understanding.
>
> Thanks
>
> On Tue, 3 Jan 2017 at 23:24 Ewen Cheslack-Postava <e...@confluent.io>
> wrote:
>
> > Yes, the consumer will pick up the new topics when it refreshes metadata
> > (defaults to every 5 min) and start subscribing to the new topics.
> >
> > -Ewen
> >
> > On Tue, Jan 3, 2017 at 3:07 PM, Greenhorn Techie <
> > greenhorntec...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > I am new to Kafka and as well as MirrorMaker. So wondering whether MM
> > would
> > > pick-up new topics that are created on the source cluster
> automatically,
> > > provided topic matches the while list pattern?
> > >
> > > For example, if I start MM as below, would it replicate any new topics
> > that
> > > are created after the MM process is launched?
> > >
> > > kafka-mirror-maker.sh --new.consumer \
> > >   --consumer.config /opt/kafka/config/consumer.properties \
> > >   --producer.config /opt/kafka/config/producer.properties \
> > >   --whitelist ".*"
> > >
> > > Or is there a need to restart MM process when a new topic / pattern is
> > > started on the source cluster?
> > >
> > > What should be the recommended approach in regards to this?
> > >
> > > Thanks
> > >
> >
>


Re: Does offsetsForTimes use createtime of logsegment file?

2017-01-05 Thread Ewen Cheslack-Postava
On Wed, Jan 4, 2017 at 11:54 PM, Vignesh  wrote:

> Hi,
>
> offsetsForTimes
> <http://kafka.apache.org/0101/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#offsetsForTimes(java.util.Map)>
> function
> returns offset for a given timestamp. Does it use message's timestamp
> (which could be LogAppendTime or set by user) or creation time of
> logsegment file?
>
>
This is actually tied to how the ListOffsetsRequest is handled. But if
you're on a recent version, then the KIP
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65868090
made it use the more accurate version based on message timestamps.


>
> KIP-33
> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-33+-+Add+a+time+based+log+index>
> adds a timestamp-based index, and it is available only from 0.10.1. Does
> the above function work on 0.10.0? If so, are there any differences in how
> it works between versions 0.10.0 and 0.10.1?
>
>
The KIP was only adopted and implemented in 0.10.1+. It is not available in
0.10.0.
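
The lookup contract is: for each partition, return the earliest offset whose timestamp is greater than or equal to the target. A toy sketch of that rule (not the broker's actual time-index code) over an in-memory list of timestamps, assuming non-decreasing timestamps as you'd get with LogAppendTime:

```python
import bisect

def offset_for_time(timestamps, target_ts):
    """timestamps[i] is the timestamp of the message at offset i.
    Return the earliest offset whose timestamp >= target_ts,
    or None if no such message exists (mirroring a null result)."""
    i = bisect.bisect_left(timestamps, target_ts)
    return i if i < len(timestamps) else None

ts = [100, 200, 200, 350, 500]   # messages at offsets 0..4
print(offset_for_time(ts, 200))  # 1: first offset at/after t=200
print(offset_for_time(ts, 300))  # 3: next message is at t=350
print(offset_for_time(ts, 600))  # None: nothing at/after t=600
```

With CreateTime timestamps set by producers, timestamps need not be monotonic, which is part of why the broker keeps a dedicated time index rather than doing a plain binary search like this.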


> Thanks,
> Vignesh.
>


Re: Is this a bug or just unintuitive behavior?

2017-01-05 Thread Ewen Cheslack-Postava
The basic issue here is just that the auto.offset.reset defaults to latest,
right? That's not a very good setting for a mirroring tool and this seems
like something we might just want to change the default for. It's debatable
whether it would even need a KIP.

We have other settings in MM that we override if they aren't set
explicitly, because we don't want the normal defaults. Most are producer
properties to avoid duplicates (the acks, retries, max.block.ms, and
max.in.flight.requests.per.connection settings), but there are a couple of
consumer ones too (auto.commit.enable and consumer.timeout.ms).
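
Spelled out (the values here are approximate recollections of MM's forced defaults, so treat them as illustrative rather than authoritative), the overridden producer settings and the consumer override being proposed look like:

```properties
# Producer settings MirrorMaker forces to avoid loss and reordering
acks=all
retries=2147483647
max.block.ms=9223372036854775807
max.in.flight.requests.per.connection=1

# Consumer side: the default change discussed above; today you would have
# to set this yourself in the MirrorMaker consumer config
auto.offset.reset=earliest
```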

This is probably something like a 1-line MM patch if someone wants to
tackle it -- the question of whether it needs a KIP or not is,
unfortunately, the more complicated question :(

-Ewen

On Thu, Jan 5, 2017 at 1:10 PM, James Cheng  wrote:

>
> > On Jan 5, 2017, at 12:57 PM, Jeff Widman  wrote:
> >
> > Thanks James and Hans.
> >
> > Will this also happen when we expand the number of partitions in a topic?
> >
> > That also will trigger a rebalance, the consumer won't subscribe to the
> > partition until the rebalance finishes, etc.
> >
> > So it'd seem that any messages published to the new partition in between
> > the partition creation and the rebalance finishing won't be consumed by
> any
> > consumers that have offset=latest
> >
>
> It hadn't occured to me until you mentioned it, but yes, I think it'd also
> happen in those cases.
>
> In the kafka consumer javadocs, they provide a list of things that would
> cause a rebalance:
> http://kafka.apache.org/0101/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#subscribe(java.util.Collection,%20org.apache.kafka.clients.consumer.ConsumerRebalanceListener)
>
> "As part of group management, the consumer will keep track of the list of
> consumers that belong to a particular group and will trigger a rebalance
> operation if one of the following events trigger -
>
> Number of partitions change for any of the subscribed list of topics
> Topic is created or deleted
> An existing member of the consumer group dies
> A new member is added to an existing consumer group via the join API
> "
>
> I'm guessing that this would affect any of those scenarios.
>
> -James
>
>
> >
> >
> >
> > On Thu, Jan 5, 2017 at 12:40 AM, James Cheng 
> wrote:
> >
> >> Jeff,
> >>
> >> Your analysis is correct. I would say that it is known but unintuitive
> >> behavior.
> >>
> >> As an example of a problem caused by this behavior, it's possible for
> >> mirrormaker to miss messages on newly created topics, even though it
> >> was subscribed to them before the topics were created.
> >>
> >> See the following JIRAs:
> >> https://issues.apache.org/jira/browse/KAFKA-3848
> >> https://issues.apache.org/jira/browse/KAFKA-3370
> >>
> >> -James
> >>
> >>> On Jan 4, 2017, at 4:37 PM, h...@confluent.io wrote:
> >>>
> >>> This sounds exactly as I would expect things to behave. If you consume
> >> from the beginning I would think you would get all the messages but not
> if
> >> you consume from the latest offset. You can separately tune the metadata
> >> refresh interval if you want to miss fewer messages but that still won't
> >> get you all messages from the beginning if you don't explicitly consume
> >> from the beginning.
> >>>
> >>> Sent from my iPhone
> >>>
>  On Jan 4, 2017, at 6:53 PM, Jeff Widman  wrote:
> 
>  I'm seeing consumers miss messages when they subscribe before the
>  topic is actually created.
> 
>  Scenario:
>  1) kafka 0.10.1.1 cluster with no topics yet, but with topic
>  auto-creation enabled, so a topic is created as soon as a message is
>  published to it
>  2) consumer subscribes using topic string or a regex pattern. Currently
>  no topics match. Consumer offset is "latest"
>  3) producer publishes to a topic that matches the string or regex
>  pattern.
>  4) broker immediately creates a topic, writes the message, and also
>  notifies the consumer group that a rebalance needs to happen to assign
>  the topic_partition to one of the consumers.
>  5) rebalance is fairly quick, maybe a second or so
>  6) a consumer is assigned to the newly-created topic_partition
> 
>  At this point, we've got a consumer steadily polling the recently
>  created topic_partition. However, the consumer.poll() never returns any
>  messages published between topic creation and when the consumer was
>  assigned to the topic_partition. I'm guessing this may be because when
>  the consumer is assigned

Re: Metric meaning

2017-01-05 Thread Ewen Cheslack-Postava
There's not currently anything more detailed than what is included in
http://kafka.apache.org/documentation/#monitoring. There's some work trying
to automate the generation of that documentation
(https://issues.apache.org/jira/browse/KAFKA-3480). That combined with some
additions to give longer descriptions for metrics could potentially help
this situation.

That said, for the example you mentioned, this is just what the description
says: when a log segment gets flushed to disk (including the index and time
index files), this tracks how long that flush takes.

-Ewen

On Thu, Jan 5, 2017 at 7:21 AM, Robert Quinlivan 
wrote:

> Hello,
>
> Are there more detailed descriptions available for the metrics exposed by
> Kafka via JMX? The current documentation provides some information but a
> few metrics are not listed in detail – for example, "Log flush rate and
> time."
>
> --
> Robert Quinlivan
> Software Engineer, Signal
>


Re: Query on MirrorMaker Replication - Bi-directional/Failover replication

2017-01-05 Thread Ewen Cheslack-Postava
On Thu, Jan 5, 2017 at 3:07 AM, Greenhorn Techie 
wrote:

> Hi,
>
> We are planning to setup MirrorMaker based Kafka replication for DR
> purposes. The base requirement is to have a DR replication from primary
> (site1) to DR site  (site2)using MirrorMaker,
>
> However, we need the solution to work in case of failover as well i.e.
> where in the event of the site1 kafka cluster failing, site2 kafka cluster
> would be made primary. Later when site1 cluster eventually comes back-up
> online, direction of replication would be from site2 to site1.
>
> But as I understand, the offsets on each of the clusters are different, so
> wondering how to design the solution given this constraint and
> requirements.
>

It turns out this is tricky. And once you start digging in you'll find it's
way more complicated than you might originally think.

Before going down the rabbit hole, I'd suggest taking a look at this great
talk by Jun Rao (one of the original authors of Kafka) about multi-DC Kafka
setups: https://www.youtube.com/watch?v=Dvk0cwqGgws

Additionally, I want to mention that while it is tempting to treat multi-DC
DR cases in a way that gives really convenient, strongly consistent, highly
available behavior (because that makes it easier to reason about and avoids
pushing much of the burden down to applications), that's not realistic or
practical. And honestly, it's rarely even
necessary. DR cases really are DR. Usually it is possible to make some
tradeoffs you might not make under normal circumstances (the most important
one being the tradeoff between possibly seeing duplicates vs exactly once).
The tension here is often that one team is responsible for maintaining the
infrastructure and handling this DR failover scenario, and others are
responsible for the behavior of the applications. The infrastructure team
is responsible for figuring out the DR failover story but if they don't
solve it at the infrastructure layer then they get stuck having to
understand all the current (and future) applications built on that
infrastructure.

That said, here are the details I think you're looking for:

The short answer right now is that doing DR failover like that is not going
to be easy with MM. Confluent is building additional tools to deal with
multi-DC setups because of a bunch of these challenges:
https://www.confluent.io/product/multi-datacenter/

For your specific concern about reversing the direction of replication,
you'd need to build additional tooling to support this. The basic list of
steps would be something like this (assuming non-compacted topics):

1. Use MM normally to replicate your data. Be *very* sure you construct
your setup to ensure *everything* is mirrored (proper # of partitions,
replication factor, topic level configs, etc). (Note that this is something
the Confluent replication solution is addressing that's a significant gap
in MM.)
2. During replication, be sure to record offset deltas for every topic
partition. These are needed to reverse the direction of replication
correctly. Make sure to store them in the backup DC and somewhere very
reliable.
3. Observe DC failure.
4. Decide to do failover. Ensure replication has actually stopped (via your
own tooling, or probably better, by using ACLs to ensure no new data can be
produced from original DC to backup DC)
5. Record all the high watermarks for every topic partition so you know
which data was replicated from the original DC (vs which is new after
failover).
6. Allow failover to proceed. Make the backup DC primary.
7. Once the original DC is back alive, you want to reverse replication and
make it the backup. Lookup the offset deltas, use them to initialize
offsets for the consumer group you'll use to do replication.
8. Go back to the original DC and make sure there isn't any "extra" data,
i.e. stuff that didn't get replicated but was successfully written to the
original DC's cluster. For topic partitions where there is data beyond the
expected offsets, you currently would need to just delete the entire set of
data, or at least the data beyond the offset we expect to start at. (A truncate
operation might be a nice way to avoid having to dump *all* the data, but
doesn't currently exist.)
9. Once you've got the two clusters back in a reasonably synced state with
appropriate starting offsets committed, start up MM again in the reverse
direction.
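
As a toy model of the offset-delta bookkeeping in steps 2, 5, and 7 (topic names and numbers are invented; a real tool would persist this per topic-partition somewhere durable):

```python
# Step 2: per topic-partition, record source_offset - backup_offset for
# each mirrored record; with no loss this delta is constant per partition.
deltas = {("orders", 0): 1500, ("orders", 1): 42}

# Step 5: high watermarks in the backup DC at failover time mark where
# replicated data ends and new post-failover writes begin.
backup_hw = {("orders", 0): 9000, ("orders", 1): 7000}

def reverse_start(tp):
    # Step 7: the reverse-direction consumer starts at the first record
    # the original DC never saw.
    return backup_hw[tp]

def expected_source_end(tp):
    # Step 8: anything in the original DC past this offset was never
    # replicated and has to be discarded before replication reverses.
    return backup_hw[tp] + deltas[tp]

print(expected_source_end(("orders", 0)))  # 10500
print(expected_source_end(("orders", 1)))  # 7042
```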

If this sounds tricky, it turns out that when you add compacted topics,
things get quite a bit messier.

-Ewen


>
> Thanks
>


Re: Kafka Connect offset.storage.topic not receiving messages (i.e. how to access Kafka Connect offset metadata?)

2017-01-05 Thread Ewen Cheslack-Postava
On Thu, Jan 5, 2017 at 11:30 AM, Phillip Mann  wrote:

> I am working on setting up a Kafka Connect Distributed Mode application
> which will be a Kafka to S3 pipeline. I am using Kafka 0.10.1.0-1 and Kafka
> Connect 3.1.1-1. So far things are going smoothly but one aspect that is
> important to the larger system I am working with requires knowing offset
> information of the Kafka -> FileSystem pipeline. According to the
> documentation, the offset.storage.topic configuration will be the location
> the distributed mode application uses for storing offset information. This
> makes sense given how Kafka stores consumer offsets in the 'new' Kafka.
> However, after doing some testing with the FileStreamSinkConnector, nothing
> is being written to my offset.storage.topic which is the default value:
> connect-offsets.
>

The documentation may need to be clarified about this -- this same question
has come up at least twice in the past 2 days.

The offset.storage.topic is actually only for source connectors. Source
connectors need to define their own format for offsets since we can't make
assumptions about how source systems define an offset in their stream of
data.

Sink connectors are just reading data out of Kafka and there is already a
good mechanism for tracking offsets, so we use that.
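
To make the distinction concrete, here is a sketch of the kind of key/value records a *source* connector produces into offset.storage.topic when the worker's internal converter is the JsonConverter (the connector name and offset fields are invented; the exact envelope depends on your internal.key.converter/internal.value.converter settings):

```python
import json

# Key: which connector, and which source partition it was reading
key = ["my-jdbc-source", {"table": "users"}]
# Value: whatever offset map the connector chose to store for that partition
value = {"incrementing.id": 4211}

print(json.dumps(key))    # ["my-jdbc-source", {"table": "users"}]
print(json.dumps(value))  # {"incrementing.id": 4211}
```

A sink connector never writes records like these; its position is just a regular committed consumer offset, which is why the topic stays empty in this setup.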


>
> To be specific, I am using a Python Kafka producer to push data to a topic
> and using Kafka Connect with the FileStreamSinkConnect to output the data
> from the topic to a file. This works and behaves as I expect the connector
> to behave. Additionally, when I stop the connector and start the connector,
> the application remembers the state in the topic and there is no data
> duplication. However, when I go to the offset.storage.topic to see what
> offset metadata is stored, there is nothing in the topic.
>
> This is the command that I use:
>
> kafka-console-consumer --bootstrap-server kafka1:9092,kafka2:9092,kafka3:9092
> --topic connect-offsets --from-beginning
>
> I receive this message after letting this command run for a minute or so:
>
> Processed a total of 0 messages
>
> So to summarize, I have 2 questions:
>
>
>   1.  Why is offset metadata not being written to the topic that should be
> storing this even though my distributed application is keeping state
> correctly?
>
>
>   1.  How do I access offset metadata information for a Kafka Connect
> distributed mode application? This is 100% necessary for my team's Lambda
> Architecture implementation of our system.
>

We want to improve this (to expose both read and write operations, since
it's also sometimes useful to be able to manually reset committed offsets):
https://issues.apache.org/jira/browse/KAFKA-3820

For sink connectors, however, you can still get this information directly.
You can use the consumer offset checker to look up offsets for any consumer
group. For Connect, the consumer group for a sink connector will be
connect-<connector-name>.

-Ewen


>
> Thanks for the help.
>


Re: One big kafka connect cluster or many small ones?

2017-01-05 Thread Ewen Cheslack-Postava
On Thu, Jan 5, 2017 at 3:12 PM, Stephane Maarek <
steph...@simplemachines.com.au> wrote:

> Hi,
>
> We like to operate in micro-services (dockerize and ship everything on ecs)
> and I was wondering which approach was preferred.
> We have one kafka cluster, one zookeeper cluster, etc, but when it comes to
> kafka connect I have some doubts.
>
> Is it better to have one big kafka connect with multiple nodes, or many
> small kafka connect clusters or standalone, for each connector / etl ?
>

You can do any of these, and it may depend on how you do
orchestration/deployment.

We built Connect to support running one big cluster running a bunch of
connectors. It balances work automatically and provides a way to control
scale up/down via increased parallelism. This means we don't need to make
any assumptions about how you deploy, how you handle elastically scaling
your clusters, etc. But if you run in an environment and have the tooling
in place to do that already, you can also opt to run many smaller clusters
and use that tooling to scale up/down. In that case you'd just make sure
there were enough tasks for each connector so that when you scale the # of
workers for a cluster up, the rebalancing of work would ensure there were
enough tasks for every worker to remain occupied.
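
As a toy model of that sizing argument (the real assignment logic lives in Connect's rebalance protocol; this hypothetical round-robin just illustrates the counting):

```python
def assign(tasks, workers):
    """Round-robin task names across worker ids; returns worker -> [tasks]."""
    assignment = {w: [] for w in workers}
    for i, task in enumerate(tasks):
        assignment[workers[i % len(workers)]].append(task)
    return assignment

tasks = [f"sink-task-{i}" for i in range(4)]

# With 2 workers every worker gets work:
print(assign(tasks, ["w1", "w2"]))
# Scaling to 6 workers without enough tasks leaves some workers idle:
a = assign(tasks, [f"w{i}" for i in range(1, 7)])
print([w for w, ts in a.items() if not ts])  # ['w5', 'w6']
```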

The main drawback of doing this is that Connect uses a few topics for
configs, status, and offsets and you need these to be unique per cluster.
This means you'll have 3N more topics. If you're running a *lot* of
connectors, that could eventually become a problem. It also means you have
that many more worker configs to handle, clusters to monitor, etc. And
deploying a connector no longer becomes as simple as just making a call to
the service's REST API since there isn't a single centralized service. The
main benefits I can think of are a) if you already have preferred tooling
for handling elasticity and b) better resource isolation between connectors
(i.e. an OOM error in one connector won't affect any other connectors).

For standalone mode, we'd generally recommend only using it when
distributed mode doesn't make sense, e.g. for log file collection. Other
than that, having the fault tolerance and high availability of distributed
mode is preferred.

On your specific points:

>
> The issues I’m trying to address are :
>  - Integration with our CI/CD pipeline
>

I'm not sure anything about Connect affects this. Is there a specific
concern you have about the CI/CD pipeline & Connect?


>  - Efficient resources utilisation
>

Putting all the connectors into one cluster will probably result in better
resource utilization unless you're already automatically tracking usage and
scaling appropriately. The reason is that if you use a bunch of small
clusters, you're now stuck trying to optimize N uses. Since Connect can
already (roughly) balance work, putting all the work into one cluster and
having connect split it up means you just need to watch utilization of the
nodes in that one cluster and scale up or down as appropriate.


>  - Easily add new jar files that connectors depend on with minimal downtime
>

This one is a bit interesting. You shouldn't have any downtime adding jars
in the sense that you can do rolling bounces of Connect. The one caveat is
that the current limitation for how it rebalances work involves halting
work for all connectors/tasks, doing the rebalance, and then starting them
up again. We plan to improve this, but the timeframe for it is still
uncertain. Usually these rebalance steps should be pretty quick. The main
reason this can be a concern is that halting some connectors could take
some time (e.g. because they need to fully flush their data). This means
the period of time your connectors are not processing data during one of
those rebalances is controlled by the "worst" connector.

I would recommend trying a single cluster but monitoring whether you see
stalls due to rebalances. If you do, then moving to multiple clusters might
make sense. (This also, obviously, depends a lot on your SLA for data
delivery.)


>  - Monitoring operations
>

Multiple clusters definitely seems messier and more complicated for this.
There will be more workers in a single cluster, but it's a single service
you need to monitor and maintain.

Hope that helps!

-Ewen


>
> Thanks for your guidance
>
> Regards,
> Stephane
>


Re: Consumer Rebalancing Question

2017-01-04 Thread Ewen Cheslack-Postava
The coordinator will immediately move the group into a rebalance if it
needs it. The reason LeaveGroupRequest was added was to avoid having to
wait for the session timeout before completing a rebalance. So aside from
the latency of cleanup/committing offsets/rejoining after a heartbeat,
rolling bounces should be fast for consumer groups.

-Ewen

On Wed, Jan 4, 2017 at 5:19 PM, Pradeep Gollakota 
wrote:

> Hi Kafka folks!
>
> When a consumer is closed, it will issue a LeaveGroupRequest. Does anyone
> know how long the coordinator waits before reassigning the partitions that
> were assigned to the leaving consumer to a new consumer? I ask because I'm
> trying to understand the behavior of consumers if you're doing a rolling
> restart.
>
> Thanks!
> Pradeep
>


Re: Kafka Connect gets into a rebalance loop

2017-01-04 Thread Ewen Cheslack-Postava
Aside from the logs you already have, the best suggestion I have is to
enable trace level logging and try to reproduce -- there are some trace
level logs in the KafkaBasedLog class that this uses which might reveal
something. But it could be an issue in the consumer as well -- it sounds
like it is getting hung up and not actually getting in sync to the end of
the log, but obviously that shouldn't happen since the consumer should be
continuously trying to read any new messages on the topic.

-Ewen

On Wed, Jan 4, 2017 at 9:32 AM, Willy Hoang <
willy.ho...@blueapron.com.invalid> wrote:

> I'm also running into this issue whenever I try to scale up from 1 worker
> to multiple. I found that I can sometimes hack around this by
> (1) waiting for the second worker to come up and start spewing out these
> log messages and then
> (2) sending a request to the REST API to update one of my connectors.
>
> I'm assuming that the update appends a message to the config topic which
> then triggers all the workers to catch up to the newest offset.
>
> "The fact that your worker isn't able to catch up probably indicates
> a connectivity issue or possibly even some misconfiguration"
>
> Since I'm able to sometimes trick the worker into catching up and I see
> multiple workers working smoothly, I'm pretty confident that my
> configurations are all correct. Any advice on how to debug what the
> underlying issue could be?
>
> P.S. The version of Kafka Connect I'm running is
> {"version":"0.10.0.0-cp1","commit":"7aeb2e89dbc741f6"}
> On Sat, Dec 17, 2016 at 7:55 PM, Ewen Cheslack-Postava <e...@confluent.io>
> wrote:
>
> > The message
> >
> > > Wasn't unable to resume work after last rebalance
> >
> > means that you previous iterations of the rebalance were somehow
> behind/out
> > of sync with other members of the group, i.e. they had not read up to the
> > same point in the config topic so it wouldn't be safe for this worker (or
> > possibly the entire cluster if this worker was the leader) to resume
> work.
> > (I think there's a typo in the log message, it should say "wasn't *able*
> to
> > resume work".)
> >
> > This message indicates the problem:
> >
> > > Catching up to assignment's config offset.
> >
> > The leader was using configs that were newer than this member, so it's
> not
> > safe for it to start its assigned work since it might be using outdated
> > configuration. When it tries to catch up, it continues trying to read up
> > until the end of the config topic, which should be at least as far as the
> > leader indicated its position was. (Another gap in logging: that message
> > should really include the offset it is trying to catch up to, although
> you
> > can also check that manually since it'll always be trying to read to the
> > end of the topic.)
> >
> > This catch up has a timeout which defaults to 3s (which is pretty
> > substantial given the rate at which configs tend to be written and their
> > size). The fact that your worker isn't able to catch up probably
> indicates
> > a connectivity issue or possibly even some misconfiguration where one
> > worker is looking at one cluster/config topic, and the other is in the
> same
> > group in the same cluster but looking at a different cluster/config topic
> > when reading configs.
> >
> > -Ewen
> >
> > On Fri, Dec 16, 2016 at 3:16 AM, Frank Lyaruu <flya...@gmail.com> wrote:
> >
> > > Hi people,
> > >
> > > I've just deployed my Kafka Streams / Connect (I only use a connect
> sink
> > to
> > > mongodb) application on a cluster of four instances (4 containers on 2
> > > machines) and now it seems to get into a sort of rebalancing loop, and
> I
> > > don't get much in mongodb, I've got a little bit of data at the
> > beginning,
> > > but no new data appears.
> > >
> > > The rest of the streams application seems to behave.
> > >
> > > This is what I get in my log, but at a pretty high speed (about 100 per
> > > second):
> > >
> > > Current config state offset 3 is behind group assignment 5, reading to
> > end
> > > of config log
> > > Joined group and got assignment: Assignment{error=0,
> > > leader='connect-2-8fb3bfc4-93f2-4d08-82df-8e7c4b99ec13', leaderUrl='',
> > > offset=5, connectorIds=[KNVB-production-generation-99-person-
> mongosink],
> > > taskIds=[]}
> > > Successfully joined group NHV-production-generation-99-
> person-mongosink
> > > with generation 

Re: Reg: Need info on Kafka Brokers

2017-01-03 Thread Ewen Cheslack-Postava
Unfortunately, I don't think it has been open sourced (it doesn't seem to
be available on https://github.com/paypal).

-Ewen

On Tue, Jan 3, 2017 at 5:54 PM, Jhon Davis  wrote:

> Found an interesting Kafka monitoring tool but no information on whether
> it's open sourced or not.
>
> https://medium.com/@rrsingh.send/monitoring-kafka-at-scale-
> paypal-254238f6022d
>
> Best,
> J.D.
>
> On Mon, Jan 2, 2017 at 11:01 PM, Sreejith S  wrote:
>
> > Hi Sumit,
> >
> > JMX Metrics will give you lot of in depth information on Kafka.
> >
> > Just try this.
> > https://github.com/srijiths/kafka-connectors/tree/master/kaf
> ka-connect-jmx
> >
> > If you use the above, then you should have a custom UI to show the metrics.
> >
> > Also, you can try the open-source Kafka monitoring tool
> >
> > https://github.com/yahoo/kafka-manager
> >
> > You can find a lot of other open-source Kafka monitoring tools as well.
> >
> > Regards,
> > Srijith
> >
> > On Mon, Jan 2, 2017 at 6:13 PM, Stevo Slavić  wrote:
> >
> > > Hello Sumit,
> > >
> > > Not yet. AFAIK a few topic management requests got introduced on the
> > > broker side (see https://issues.apache.org/jira/browse/KAFKA-2229 ) but
> > > not yet in the client APIs. A request for querying/listing topics doesn't
> > > even seem to be planned yet.
> > >
> > > AdminUtils, which talks to ZooKeeper via ZkClient, is a Kafka client API
> > > one can use to retrieve some information if you're building your own
> > > monitoring tools or reporting custom metrics. There are already multiple
> > > OSS and closed-source commercial products for monitoring Kafka which have
> > > this functionality.
> > >
> > > Kind regards,
> > > Stevo Slavic.
> > >
> > > On Mon, Jan 2, 2017 at 1:23 PM, Sumit Maheshwari <
> sumitm.i...@gmail.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I am looking to get information about individual brokers in kafka
> > > cluster.
> > > > The information that I am looking for is:
> > > >
> > > >- List of topics in a broker
> > > >- Partitions for each topic in a broker
> > > >- Metrics like BytesIn/Out Per min, Messages In/Min per topic
> > > >- ...
> > > >
> > > > I have tried looking into the JMX metrics but they do not seem to
> > > > provide the above information.
> > > > Is it possible to get this information from the broker itself without
> > > > contacting ZooKeeper?
> > > >
> > > > Thanks,
> > > > Sumit
> > > >
> > >
> >
> >
> >
> > --
> >
> >
> > *Sreejith.S*
> > https://github.com/srijiths/
> > http://srijiths.wordpress.com/
> > tweet2sree@twitter
> >
>


Re: Why does consumer.subscribe(Pattern) require a ConsumerRebalanceListener?

2017-01-03 Thread Ewen Cheslack-Postava
Tbh, I can't remember the exact details around the discussion of the
addition of this API, but I think this was to minimize API bloat. It's easy
to end up with 83 overloads of methods to handle all the different
combinations of parameters, but just a couple of shorthand overrides cover
the vast majority of use cases. Only providing 2 or 3 shorthand versions of
methods is good enough and requiring the full parameter list for the rest
isn't overly burdensome for users.

The assumption here is that anyone using regex subscriptions is an "expert"
user that will also understand rebalance listeners. You still don't need to
provide a real implementation if you don't need it -- the
NoOpConsumerRebalanceListener used for the overrides of subscribe that
don't take rebalance listener arguments should work fine here as well. I
think the only concern is that they are technically in the non-public,
internal package, although an equivalent (empty) implementation would work
just as well if you're worried about API compatibility/stability.
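
[Editor's note] For illustration, a minimal no-op listener in user code. This is a sketch only: `RebalanceListenerLike` and `TopicPartitionLike` are stand-ins mirroring the shape of the real `org.apache.kafka.clients.consumer.ConsumerRebalanceListener` and `org.apache.kafka.common.TopicPartition`, included so the snippet compiles without the kafka-clients jar.

```java
import java.util.Collection;

// Stand-ins for the real Kafka client types, so this sketch is
// self-contained; swap in ConsumerRebalanceListener and TopicPartition
// when compiling against kafka-clients.
interface TopicPartitionLike {}

interface RebalanceListenerLike {
    void onPartitionsRevoked(Collection<TopicPartitionLike> partitions);
    void onPartitionsAssigned(Collection<TopicPartitionLike> partitions);
}

// An empty implementation owned by your own code base, avoiding any
// dependency on Kafka's non-public internals package.
class NoOpRebalanceListener implements RebalanceListenerLike {
    @Override public void onPartitionsRevoked(Collection<TopicPartitionLike> partitions) {}
    @Override public void onPartitionsAssigned(Collection<TopicPartitionLike> partitions) {}
}

public class NoOpListenerDemo {
    public static void main(String[] args) {
        RebalanceListenerLike listener = new NoOpRebalanceListener();
        // With the real types this would be passed to
        // consumer.subscribe(pattern, listener).
        listener.onPartitionsAssigned(java.util.Collections.emptyList());
        System.out.println("no-op listener ok");
    }
}
```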

-Ewen

On Tue, Jan 3, 2017 at 6:30 PM, James Cheng  wrote:

> Hi,
>
> I was looking at the docs for the consumer, and noticed that when calling
> subscribe() with a regex Pattern, that you are required to pass in a
> ConsumerRebalanceListener. On the other hand, when you use a fixed set of
> topic names (Collection), the ConsumerRebalanceListener is optional
> (that is, there is a subscribe(Collection) that does not require a
> ConsumerRebalanceListener)
>
> http://kafka.apache.org/0101/javadoc/org/apache/kafka/clients/consumer/
> KafkaConsumer.html#subscribe(java.util.regex.Pattern,%
> 20org.apache.kafka.clients.consumer.ConsumerRebalanceListener) <
> http://kafka.apache.org/0101/javadoc/org/apache/kafka/clients/consumer/
> KafkaConsumer.html#subscribe(java.util.regex.Pattern,
> org.apache.kafka.clients.consumer.ConsumerRebalanceListener)>
>
> Why does the regex one require a rebalance listener, whereas the
> fixed-topic one does not? Is it to force the user to think through what
> happens as new topic/partitions appear and disappear?
>
> -James
>
>


Re: MirrorMaker - Topics Identification and Replication

2017-01-03 Thread Ewen Cheslack-Postava
Yes, the consumer will pick up the new topics when it refreshes metadata
(defaults to every 5 min) and start subscribing to the new topics.
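
[Editor's note] The 5-minute default corresponds to the consumer's `metadata.max.age.ms` setting; if new topics need to be picked up sooner, it can be lowered in the consumer.properties passed to MirrorMaker. A hedged sketch (addresses and values here are illustrative):

```properties
# consumer.properties fragment (illustrative values)
bootstrap.servers=source-broker:9092
group.id=mirror-maker-group
# How long cached topic metadata is kept before a forced refresh.
# Default is 300000 ms (5 minutes); lowering it makes MM notice new
# whitelisted topics sooner, at the cost of more metadata requests.
metadata.max.age.ms=60000
```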

-Ewen

On Tue, Jan 3, 2017 at 3:07 PM, Greenhorn Techie 
wrote:

> Hi,
>
> I am new to Kafka as well as MirrorMaker. So I am wondering whether MM would
> pick up new topics that are created on the source cluster automatically,
> provided the topic matches the whitelist pattern?
>
> For example, if I start MM as below, would it replicate any new topics that
> are created after the MM process is launched?
>
> kafka-mirror-maker.sh --new.consumer --consumer.config /opt/kafka/config/
> consumer.properties --producer.config /opt/kafka/config/producer.
> properties
> --whitelist ".*"
>
> Or is there a need to restart MM process when a new topic / pattern is
> started on the source cluster?
>
> What should be the recommended approach in regards to this?
>
> Thanks
>


Re: Kafka Connect Consumer reading messages from Kafka recursively

2017-01-03 Thread Ewen Cheslack-Postava
It's a bit odd (and I just opened a JIRA about it), but you actually need
read permission for the group and read permission for the topic.

There are some error responses which may only be logged at DEBUG level, but
I think they should all be throwing an exception and Kafka Connect would
log that at ERROR level. The only case I can find that doesn't do that is
if topic authorization failed.

-Ewen

On Tue, Jan 3, 2017 at 2:21 PM, Srikrishna Alla <allasrikrish...@gmail.com>
wrote:

> Hi Ewen,
>
> I did not see any "ERROR" messages in the connect logs. But, I checked the
> __consumer_offsets topic and it doesn't have anything in it. Should I
> provide write permissions to this topic for my Kafka client user? I am
> running my consumer using a different user than Kafka user.
>
> Thanks,
> Sri
>
> On Tue, Jan 3, 2017 at 3:40 PM, Ewen Cheslack-Postava <e...@confluent.io>
> wrote:
>
> > On Tue, Jan 3, 2017 at 12:58 PM, Srikrishna Alla <
> > allasrikrish...@gmail.com>
> > wrote:
> >
> > > Thanks for your response Ewen. I will try to make updates to the
> producer
> > > as suggested. Regd the Sink Connector consumer, Could it be that
> > > connect-offsets topic is not getting updated with the offset
> information
> > > per consumer? In that case, will the connector consume the same
> messages
> > > again and again? Also, if that is the case, how would I be able to
> > > troubleshoot? I am running a secured Kafka setup with SASL_PLAINTEXT
> > setup.
> > > Which users/groups should have access to write to the default topics?
> If
> > > not, please guide me in the right direction.
> > >
> > >
> > For sink connectors we actually don't use the connect-offsets topic. So
> if
> > you only have that one sink connector running, you shouldn't expect to
> see
> > any writes to it. Since sink connectors are just consumer groups, they
> use
> > the existing __consumer_offsets topic for storage and do the commits via
> > the normal consumer commit APIs. For ACLs, you'll want Read access to the
> > Group and the Topic.
> >
> > But I doubt it is ACL issues if you're only seeing this when there is
> heavy
> > load. You could use the consumer offset checker to see if any offsets are
> > committed for the group. Also, is there anything in the logs that might
> > indicate a problem with the consumer committing offsets?
> >
> > -Ewen
> >
> >
> > > Thanks,
> > > Sri
> > >
> > > On Tue, Jan 3, 2017 at 1:59 PM, Ewen Cheslack-Postava <
> e...@confluent.io
> > >
> > > wrote:
> > >
> > > > On Tue, Jan 3, 2017 at 8:38 AM, Srikrishna Alla <
> > > allasrikrish...@gmail.com
> > > > >
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I am using Kafka/Kafka Connect to track certain events happening in
> > my
> > > > > application. This is how I have implemented it -
> > > > > 1. My application is opening a KafkaProducer every time this event
> > > > happens
> > > > > and writes to my topic. My application has several components
> running
> > > in
> > > > > Yarn and so I did not find a way to have just one producer and
> reuse
> > > it.
> > > > > Once the event has been published, producer is closed
> > > > >
> > > >
> > > > KafkaProducer is thread safe, so you can allocate a single producer
> per
> > > > process and use it every time the event occurs on any thread.
> Creating
> > > and
> > > > destroying a producer for every event will be very inefficient -- not
> > > only
> > > > are you opening new TCP connections every time, having to lookup
> > metadata
> > > > every time, etc, you also don't allow the producer to get any benefit
> > > from
> > > > batching so every message will require its own request/response.
> > > >
> > > >
> > > > > 2. I am using Kafka Connect Sink Connector to consume from my topic
> > and
> > > > > write to DB and do other processing.
> > > > >
> > > > > This setup is working great as long as we have a stable number of
> > > events
> > > > > published. The issue I am facing is when we have a huge number of
> > > > events(in
> > > > > thousands within minutes) hitting Kafka. In this case, my Sink
> > > Connector
> > > > is
> > > > > going into a loop and reading events from Kafka recursively and not
> > > > > stopping. What could have triggered this? Please provide your
> > valuable
> > > > > insights.
> > > > >
> > > >
> > > > What exactly do you mean by "reading events from Kafka recursively"?
> > > Unless
> > > > it's hitting some errors that are causing consumers to fall out of
> the
> > > > group uncleanly and then rejoin later, you shouldn't be seeing
> > > duplicates.
> > > > Is there anything from the logs that might help reveal the problem?
> > > >
> > > > -Ewen
> > > >
> > > >
> > > > >
> > > > > Thanks,
> > > > > Sri
> > > > >
> > > >
> > >
> >
>


Re: About Kafka Consumer : synchronous and blocking ?

2017-01-03 Thread Ewen Cheslack-Postava
On Tue, Jan 3, 2017 at 12:59 PM, Paolo Patierno <ppatie...@live.com> wrote:

> So it means that some of them (I.e. seek) just set some local state so
> even if synchronous they are fast. The others involve network communication
> so they can block if there are connection problems. Poll is the only one
> that really block due to the timeout.
>
> From my points of view I think that I have to consider all of them as
> synchronous and blocking in terms of Vert.x implementation.
>

Makes sense. Presumably running it in a thread and exposing an alternative
async API would be fine, just be careful about committing offsets properly!
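
[Editor's note] A minimal sketch of that thread-wrapping pattern using only java.util.concurrent. `blockingPoll()` is a placeholder for `KafkaConsumer.poll(timeout)`; note the real consumer is NOT thread-safe, so every consumer call (including commits) must be funneled through the single worker thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch of exposing an async API over a blocking client call, for an
// event-loop framework like Vert.x.
public class AsyncConsumerSketch {
    // Single worker thread: the one place the (non-thread-safe) consumer
    // would ever be touched.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    String blockingPoll() {
        // Placeholder: a real implementation would call consumer.poll(...)
        // and handle offset commits on this same thread.
        return "records";
    }

    public CompletableFuture<String> pollAsync() {
        // Runs the blocking call off the event loop; callers get a future.
        return CompletableFuture.supplyAsync(this::blockingPoll, worker);
    }

    public void close() {
        worker.shutdown();
    }

    public static void main(String[] args) throws Exception {
        AsyncConsumerSketch c = new AsyncConsumerSketch();
        System.out.println(c.pollAsync().get()); // prints "records"
        c.close();
    }
}
```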

-Ewen


>
> Thanks
> Paolo
>
> Get Outlook for Android<https://aka.ms/ghei36>
>
> ____
> From: Ewen Cheslack-Postava <e...@confluent.io>
> Sent: Tuesday, January 03, 2017 9:37:38 PM
> To: users@kafka.apache.org
> Subject: Re: About Kafka Consumer : synchronous and blocking ?
>
> That's correct. Aside from commitAsync, all the consumer methods will
> block, although note that some are just local operations that affect
> subsequent method calls (e.g. seek() just sets some state locally). In
> fact, the only call that I think you'd need to actually worry about
> blocking is poll(). That has a timeout so you can avoid blocking if you
> need to (i.e. pass 0), although that will not necessarily be very
> efficient.
>
> -Ewen
>
> On Tue, Jan 3, 2017 at 8:14 AM, Paolo Patierno <ppatie...@live.com> wrote:
>
> > Hi all,
> >
> >
> > I'm working on a Kafka Client (https://github.com/vert-x3/
> > vertx-kafka-client) for Vert.x toolkit (http://vertx.io/).
> >
> >
> > For the way Vert.x works, I need to know if all the Consumer APIs work in
> > a "synchronous" way (so blocking); I see that only commitAsync works
> > asynchronously.
> >
> > Is that true ?
> >
> >
> > In the Vert.x toolkit the "karma" is "not stop the event loop" so I need
> > to make some changes if a Kafka client call can hang for a few seconds.
> >
> >
> > Thanks,
> >
> > Paolo.
> >
> >
> > Paolo Patierno
> > Senior Software Engineer (IoT) @ Red Hat
> > Microsoft MVP on Windows Embedded & IoT
> > Microsoft Azure Advisor
> >
> > Twitter : @ppatierno<http://twitter.com/ppatierno>
> > Linkedin : paolopatierno<http://it.linkedin.com/in/paolopatierno>
> > Blog : DevExperience<http://paolopatierno.wordpress.com/>
> >
>


Re: how to ingest a database with a Kafka Connect cluster in parallel?

2017-01-03 Thread Ewen Cheslack-Postava
It's an implementation detail of the JDBC connector. You could potentially
write a connector that parallelizes at that level, but you lose other
potentially useful properties (e.g. ordering). To split at this level you'd
have to do something like have each task be responsible for a subset of
rowids in the database.

-Ewen

On Tue, Jan 3, 2017 at 1:24 PM, Yuanzhe Yang <yyz1...@gmail.com> wrote:

> Hi Ewen,
>
> Thanks a lot for your reply. So it means we cannot parallelize ingestion of
> one table with multiple processes. Is it because of Kafka Connect or the
> JDBC connector?
>
> Have a nice day.
>
> Best regards,
> Yang
>
>
> 2017-01-03 20:55 GMT+01:00 Ewen Cheslack-Postava <e...@confluent.io>:
>
> > The unit of parallelism in connect is a task. It's only listing one task,
> > so you only have one process copying data. The connector can consume data
> > from within a single *database* in parallel, but each *table* must be
> > handled by a single task. Since your table whitelist only includes a
> single
> > table, the connector will only generate a single task. If you add more
> > tables to the whitelist then you'll see more tasks in the status API
> > output.
> >
> > -Ewen
> >
> > On Tue, Jan 3, 2017 at 4:03 AM, Yuanzhe Yang <yyz1...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > I am trying to run a Kafka Connect cluster to ingest data from a
> > relational
> > > database with jdbc connector.
> > >
> > > I have been investigating many other solutions including Spark, Flink
> and
> > > Flume before using Kafka Connect, but none of them can be used to
> ingest
> > > relational databases in a clusterable way. With "cluster" I mean
> > ingesting
> > > one database with several distributed processes in parallel, instead of
> > > each process in the cluster ingesting different databases. Kafka
> Connect
> > is
> > > the option I am investigating currently. After reading the
> > documentation, I
> > > have not found any clear statement about whether my use case can be
> supported,
> > > so I have to make a test to figure it out.
> > >
> > > I created a cluster with the following docker container configuration:
> > >
> > > ---
> > > version: '2'
> > > services:
> > >  zookeeper:
> > >image: confluentinc/cp-zookeeper
> > >hostname: zookeeper
> > >ports:
> > >  - "2181"
> > >environment:
> > >  ZOOKEEPER_CLIENT_PORT: 2181
> > >  ZOOKEEPER_TICK_TIME: 2000
> > >
> > >   broker1:
> > >image: confluentinc/cp-kafka
> > >hostname: broker1
> > >depends_on:
> > >  - zookeeper
> > >ports:
> > >  - '9092'
> > >environment:
> > >  KAFKA_BROKER_ID: 1
> > >  KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
> > >  KAFKA_ADVERTISED_LISTENERS: 'PLAINTEXT://broker1:9092'
> > >
> > >   broker2:
> > >image: confluentinc/cp-kafka
> > >hostname: broker2
> > >depends_on:
> > >  - zookeeper
> > >ports:
> > >  - '9092'
> > >environment:
> > >  KAFKA_BROKER_ID: 2
> > >  KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
> > >  KAFKA_ADVERTISED_LISTENERS: 'PLAINTEXT://broker2:9092'
> > >
> > >   broker3:
> > >image: confluentinc/cp-kafka
> > >hostname: broker3
> > >depends_on:
> > >  - zookeeper
> > >ports:
> > >  - '9092'
> > >environment:
> > >  KAFKA_BROKER_ID: 3
> > >  KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
> > >  KAFKA_ADVERTISED_LISTENERS: 'PLAINTEXT://broker3:9092'
> > >
> > >   schema_registry:
> > >image: confluentinc/cp-schema-registry
> > >hostname: schema_registry
> > >depends_on:
> > >  - zookeeper
> > >  - broker1
> > >  - broker2
> > >  - broker3
> > >ports:
> > >  - '8081'
> > >environment:
> > >  SCHEMA_REGISTRY_HOST_NAME: schema_registry
> > >  SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: 'zookeeper:2181'
> > >
> > >   connect1:
> > >image: confluentinc/cp-kafka-connect
> > >hostname: connect1
> > >depends_on:
> > >  - zookeeper
> > >  - broker1
> > >  - broker2

Re: Kafka Connect Consumer reading messages from Kafka recursively

2017-01-03 Thread Ewen Cheslack-Postava
On Tue, Jan 3, 2017 at 12:58 PM, Srikrishna Alla <allasrikrish...@gmail.com>
wrote:

> Thanks for your response Ewen. I will try to make updates to the producer
> as suggested. Regd the Sink Connector consumer, Could it be that
> connect-offsets topic is not getting updated with the offset information
> per consumer? In that case, will the connector consume the same messages
> again and again? Also, if that is the case, how would I be able to
> troubleshoot? I am running a secured Kafka setup with SASL_PLAINTEXT setup.
> Which users/groups should have access to write to the default topics? If
> not, please guide me in the right direction.
>
>
For sink connectors we actually don't use the connect-offsets topic. So if
you only have that one sink connector running, you shouldn't expect to see
any writes to it. Since sink connectors are just consumer groups, they use
the existing __consumer_offsets topic for storage and do the commits via
the normal consumer commit APIs. For ACLs, you'll want Read access to the
Group and the Topic.

But I doubt it is ACL issues if you're only seeing this when there is heavy
load. You could use the consumer offset checker to see if any offsets are
committed for the group. Also, is there anything in the logs that might
indicate a problem with the consumer committing offsets?

-Ewen


> Thanks,
> Sri
>
> On Tue, Jan 3, 2017 at 1:59 PM, Ewen Cheslack-Postava <e...@confluent.io>
> wrote:
>
> > On Tue, Jan 3, 2017 at 8:38 AM, Srikrishna Alla <
> allasrikrish...@gmail.com
> > >
> > wrote:
> >
> > > Hi,
> > >
> > > I am using Kafka/Kafka Connect to track certain events happening in my
> > > application. This is how I have implemented it -
> > > 1. My application is opening a KafkaProducer every time this event
> > happens
> > > and writes to my topic. My application has several components running
> in
> > > Yarn and so I did not find a way to have just one producer and reuse
> it.
> > > Once the event has been published, producer is closed
> > >
> >
> > KafkaProducer is thread safe, so you can allocate a single producer per
> > process and use it every time the event occurs on any thread. Creating
> and
> > destroying a producer for every event will be very inefficient -- not
> only
> > are you opening new TCP connections every time, having to lookup metadata
> > every time, etc, you also don't allow the producer to get any benefit
> from
> > batching so every message will require its own request/response.
> >
> >
> > > 2. I am using Kafka Connect Sink Connector to consume from my topic and
> > > write to DB and do other processing.
> > >
> > > This setup is working great as long as we have a stable number of
> events
> > > published. The issue I am facing is when we have a huge number of
> > events(in
> > > thousands within minutes) hitting Kafka. In this case, my Sink
> Connector
> > is
> > > going into a loop and reading events from Kafka recursively and not
> > > stopping. What could have triggered this? Please provide your valuable
> > > insights.
> > >
> >
> > What exactly do you mean by "reading events from Kafka recursively"?
> Unless
> > it's hitting some errors that are causing consumers to fall out of the
> > group uncleanly and then rejoin later, you shouldn't be seeing
> duplicates.
> > Is there anything from the logs that might help reveal the problem?
> >
> > -Ewen
> >
> >
> > >
> > > Thanks,
> > > Sri
> > >
> >
>


Re: About Kafka Consumer : synchronous and blocking ?

2017-01-03 Thread Ewen Cheslack-Postava
That's correct. Aside from commitAsync, all the consumer methods will
block, although note that some are just local operations that affect
subsequent method calls (e.g. seek() just sets some state locally). In
fact, the only call that I think you'd need to actually worry about
blocking is poll(). That has a timeout so you can avoid blocking if you
need to (i.e. pass 0), although that will not necessarily be very efficient.

-Ewen

On Tue, Jan 3, 2017 at 8:14 AM, Paolo Patierno  wrote:

> Hi all,
>
>
> I'm working on a Kafka Client (https://github.com/vert-x3/
> vertx-kafka-client) for Vert.x toolkit (http://vertx.io/).
>
>
> For the way Vert.x works, I need to know if all the Consumer APIs work in
> a "synchronous" way (so blocking); I see that only commitAsync works
> asynchronously.
>
> Is that true ?
>
>
> In the Vert.x toolkit the "karma" is "not stop the event loop" so I need
> to make some changes if a Kafka client call can hang for a few seconds.
>
>
> Thanks,
>
> Paolo.
>
>
> Paolo Patierno
> Senior Software Engineer (IoT) @ Red Hat
> Microsoft MVP on Windows Embedded & IoT
> Microsoft Azure Advisor
>
> Twitter : @ppatierno
> Linkedin : paolopatierno
> Blog : DevExperience
>


Re: Schema reg avro version

2017-01-03 Thread Ewen Cheslack-Postava
It just hasn't been updated recently. It isn't just the Schema Registry that
needs to be updated since other components use the library as well and we'd
need to avoid potential classpath conflicts, but it should be
straightforward to update.

-Ewen

On Tue, Jan 3, 2017 at 8:45 AM, Scott Ferguson  wrote:

> Hey all,
>
> It looks like the Schema Registry is using Avro 1.7.7 and not 1.8.1. Anyone
> know why?
> https://github.com/confluentinc/schema-registry/blob/master/pom.xml#L55
>
> Thanks,
> Scott
>


Re: Kafka Connect Consumer reading messages from Kafka recursively

2017-01-03 Thread Ewen Cheslack-Postava
On Tue, Jan 3, 2017 at 8:38 AM, Srikrishna Alla 
wrote:

> Hi,
>
> I am using Kafka/Kafka Connect to track certain events happening in my
> application. This is how I have implemented it -
> 1. My application is opening a KafkaProducer every time this event happens
> and writes to my topic. My application has several components running in
> Yarn and so I did not find a way to have just one producer and reuse it.
> Once the event has been published, producer is closed
>

KafkaProducer is thread safe, so you can allocate a single producer per
process and use it every time the event occurs on any thread. Creating and
destroying a producer for every event will be very inefficient -- not only
are you opening new TCP connections every time, having to lookup metadata
every time, etc, you also don't allow the producer to get any benefit from
batching so every message will require its own request/response.
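
[Editor's note] The one-producer-per-process pattern sketched in pure stdlib Java. `FakeProducer` is a stand-in for `org.apache.kafka.clients.producer.KafkaProducer`, which is thread-safe and intended to be shared by all threads for the life of the process.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Stand-in for KafkaProducer; just counts how many instances get created.
class FakeProducer {
    static final AtomicInteger created = new AtomicInteger();
    FakeProducer() { created.incrementAndGet(); }
    void send(String topic, String value) { /* real client: batching + network I/O */ }
}

public class SharedProducerDemo {
    // Initialized once; every event, on any thread, reuses this instance.
    private static final FakeProducer PRODUCER = new FakeProducer();

    static void onEvent(String value) {
        PRODUCER.send("events", value); // never create/close per event
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> onEvent("a"));
        Thread t2 = new Thread(() -> onEvent("b"));
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(FakeProducer.created.get()); // prints 1
    }
}
```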


> 2. I am using Kafka Connect Sink Connector to consume from my topic and
> write to DB and do other processing.
>
> This setup is working great as long as we have a stable number of events
> published. The issue I am facing is when we have a huge number of events(in
> thousands within minutes) hitting Kafka. In this case, my Sink Connector is
> going into a loop and reading events from Kafka recursively and not
> stopping. What could have triggered this? Please provide your valuable
> insights.
>

What exactly do you mean by "reading events from Kafka recursively"? Unless
it's hitting some errors that are causing consumers to fall out of the
group uncleanly and then rejoin later, you shouldn't be seeing duplicates.
Is there anything from the logs that might help reveal the problem?

-Ewen


>
> Thanks,
> Sri
>


Re: how to ingest a database with a Kafka Connect cluster in parallel?

2017-01-03 Thread Ewen Cheslack-Postava
The unit of parallelism in connect is a task. It's only listing one task,
so you only have one process copying data. The connector can consume data
from within a single *database* in parallel, but each *table* must be
handled by a single task. Since your table whitelist only includes a single
table, the connector will only generate a single task. If you add more
tables to the whitelist then you'll see more tasks in the status API output.
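
[Editor's note] A hedged illustration of how the whitelist drives task count: with three tables whitelisted and `tasks.max` of 3, the connector can create up to three tasks, one per table. Property names follow the Confluent JDBC source connector docs; the connection URL, table names, and column name are placeholders.

```json
{
  "name": "jdbc-source",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:postgresql://db:5432/mydb",
    "table.whitelist": "orders,customers,invoices",
    "tasks.max": "3",
    "mode": "incrementing",
    "incrementing.column.name": "id"
  }
}
```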

-Ewen

On Tue, Jan 3, 2017 at 4:03 AM, Yuanzhe Yang  wrote:

> Hi all,
>
> I am trying to run a Kafka Connect cluster to ingest data from a relational
> database with jdbc connector.
>
> I have been investigating many other solutions including Spark, Flink and
> Flume before using Kafka Connect, but none of them can be used to ingest
> relational databases in a clusterable way. With "cluster" I mean ingesting
> one database with several distributed processes in parallel, instead of
> each process in the cluster ingesting different databases. Kafka Connect is
> the option I am investigating currently. After reading the documentation, I
> have not found any clear statement about whether my use case can be supported,
> so I have to make a test to figure it out.
>
> I created a cluster with the following docker container configuration:
>
> ---
> version: '2'
> services:
>  zookeeper:
>image: confluentinc/cp-zookeeper
>hostname: zookeeper
>ports:
>  - "2181"
>environment:
>  ZOOKEEPER_CLIENT_PORT: 2181
>  ZOOKEEPER_TICK_TIME: 2000
>
>   broker1:
>image: confluentinc/cp-kafka
>hostname: broker1
>depends_on:
>  - zookeeper
>ports:
>  - '9092'
>environment:
>  KAFKA_BROKER_ID: 1
>  KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
>  KAFKA_ADVERTISED_LISTENERS: 'PLAINTEXT://broker1:9092'
>
>   broker2:
>image: confluentinc/cp-kafka
>hostname: broker2
>depends_on:
>  - zookeeper
>ports:
>  - '9092'
>environment:
>  KAFKA_BROKER_ID: 2
>  KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
>  KAFKA_ADVERTISED_LISTENERS: 'PLAINTEXT://broker2:9092'
>
>   broker3:
>image: confluentinc/cp-kafka
>hostname: broker3
>depends_on:
>  - zookeeper
>ports:
>  - '9092'
>environment:
>  KAFKA_BROKER_ID: 3
>  KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
>  KAFKA_ADVERTISED_LISTENERS: 'PLAINTEXT://broker3:9092'
>
>   schema_registry:
>image: confluentinc/cp-schema-registry
>hostname: schema_registry
>depends_on:
>  - zookeeper
>  - broker1
>  - broker2
>  - broker3
>ports:
>  - '8081'
>environment:
>  SCHEMA_REGISTRY_HOST_NAME: schema_registry
>  SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: 'zookeeper:2181'
>
>   connect1:
>image: confluentinc/cp-kafka-connect
>hostname: connect1
>depends_on:
>  - zookeeper
>  - broker1
>  - broker2
>  - broker3
>  - schema_registry
>ports:
>  - "8083"
>environment:
>  CONNECT_BOOTSTRAP_SERVERS: 'broker1:9092,broker2:9092,broker3:9092'
>  CONNECT_REST_ADVERTISED_HOST_NAME: connect1
>  CONNECT_REST_PORT: 8083
>  CONNECT_GROUP_ID: compose-connect-group
>  CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
>  CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
>  CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
>  CONNECT_KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
>  CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: '
> http://schema_registry:8081
> '
>  CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
>  CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: '
> http://schema_registry:8081'
>  CONNECT_INTERNAL_KEY_CONVERTER: org.apache.kafka.connect.json.
> JsonConverter
>  CONNECT_INTERNAL_VALUE_CONVERTER: org.apache.kafka.connect.json.
> JsonConverter
>  CONNECT_ZOOKEEPER_CONNECT: 'zookeeper:2181'
>
>   connect2:
>image: confluentinc/cp-kafka-connect
>hostname: connect2
>depends_on:
>  - zookeeper
>  - broker1
>  - broker2
>  - broker3
>  - schema_registry
>ports:
>  - "8083"
>environment:
>  CONNECT_BOOTSTRAP_SERVERS: 'broker1:9092,broker2:9092,broker3:9092'
>  CONNECT_REST_ADVERTISED_HOST_NAME: connect2
>  CONNECT_REST_PORT: 8083
>  CONNECT_GROUP_ID: compose-connect-group
>  CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
>  CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
>  CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
>  CONNECT_KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
>  CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: '
> http://schema_registry:8081
> '
>  CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
>  CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: '
> http://schema_registry:8081'
>  CONNECT_INTERNAL_KEY_CONVERTER: org.apache.kafka.connect.json.
> JsonConverter
>  CONNECT_INTERNAL_VALUE_CONVERTER: 

Re: kafka streams and broadcast topic

2017-01-02 Thread Ewen Cheslack-Postava
I think what you're describing could be handled in KStreams by a "global"
KTable. This functionality is currently being discussed/voted on in a KIP
discussion: https://cwiki.apache.org/confluence/pages/viewpa
ge.action?pageId=67633649 The list of interests would be a global KTable
(shared globally across all streams instances) and each instance would
filter/categorize based on that table via a join.

-Ewen

On Fri, Dec 30, 2016 at 3:02 PM, Matt King  wrote:

> I'd like to have the following:
>
> One large stream of content coming through a topic, with Kafka Stream
> filtering to identify records of interest.   I can see how this would be
> sharded to allow scale out to handle a large stream of content.
>
> I would like to have a 2nd, smaller, topic to define the areas of interest
> that would be used for the filtering.  This topic should be available to
> all the stream processing filters.  When a new filter comes up it should be
> able to recreate its state.   As area of interest definitions change these
> changes should also go out to all the filtering applications.
>
> Can this be done directly with Kafka Streams?  I can get the primary stream
> working with a static set of interests and the filtering works fine.  But
> adding in a second input stream I'm having trouble.  There doesn't seem to
> be a way to have the same topic/partition go to all the applications?
> Alternatively I could imagine broadcasting the interests to multiple
> partitions but don't see how that is done.
>
> Perhaps the area-of-interest topic should be done using a plain old Kafka
> producer, sending it to all partitions?
>
> Am I making sense?
>
> Happy New Year
>
> Matt
>


Re: Interesting error message du jour

2016-12-30 Thread Ewen Cheslack-Postava
Jon,

This looks the same as https://issues.apache.org/jira/browse/KAFKA-4563,
although for a different invalid transition. The temporary fix suggested
there is to simply convert the exception to log a warning, which should be
a pretty trivial patch against trunk. It seems there are some transitions
that haven't fully been thought through even if they may actually be valid,
so patching this for now may be the easiest way to unblock your development.

-Ewen

On Fri, Dec 30, 2016 at 9:45 AM, Jon Yeargers 
wrote:

> Attaching the debug log
>
> On Fri, Dec 30, 2016 at 6:39 AM, Jon Yeargers 
> wrote:
>
>> Using 0.10.2.0-snapshot:
>>
>> java.lang.IllegalStateException: Incorrect state transition from
>> ASSIGNING_PARTITIONS to ASSIGNING_PARTITIONS
>>
>> at org.apache.kafka.streams.processor.internals.StreamThread.
>> setState(StreamThread.java:163)
>>
>> at org.apache.kafka.streams.processor.internals.StreamThread.se
>> tStateWhenNotInPendingShutdown(StreamThread.java:175)
>>
>> at org.apache.kafka.streams.processor.internals.StreamThread.
>> access$200(StreamThread.java:71)
>>
>> at org.apache.kafka.streams.processor.internals.StreamThread$1.
>> onPartitionsAssigned(StreamThread.java:234)
>>
>> at org.apache.kafka.clients.consumer.internals.ConsumerCoordina
>> tor.onJoinComplete(ConsumerCoordinator.java:230)
>>
>> at org.apache.kafka.clients.consumer.internals.AbstractCoordina
>> tor.joinGroupIfNeeded(AbstractCoordinator.java:314)
>>
>> at org.apache.kafka.clients.consumer.internals.AbstractCoordina
>> tor.ensureActiveGroup(AbstractCoordinator.java:278)
>>
>> at org.apache.kafka.clients.consumer.internals.ConsumerCoordina
>> tor.poll(ConsumerCoordinator.java:261)
>>
>> at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(
>> KafkaConsumer.java:1039)
>>
>> at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaCo
>> nsumer.java:1004)
>>
>> at org.apache.kafka.streams.processor.internals.StreamThread.
>> runLoop(StreamThread.java:569)
>>
>> at org.apache.kafka.streams.processor.internals.StreamThread.
>> run(StreamThread.java:358)
>>
>
>


Re: Processing time series data in order

2016-12-29 Thread Ewen Cheslack-Postava
The best you can do to ensure ordering today is to set:

acks = all
retries = Integer.MAX_VALUE
max.block.ms = Long.MAX_VALUE
max.in.flight.requests.per.connection = 1

This ensures there's only one outstanding produce request (batch of
messages) at a time, it will be retried indefinitely on retriable errors,
it will be fully replicated before it is acked, and if you run out of
buffer space you will block indefinitely until some of the data is
successfully produced and frees up buffer space. This effectively makes the
scenario you describe impossible.
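
Translated into a `java.util.Properties` sketch for the Java producer (the
bootstrap address is a placeholder, not from this thread; pass the result to
`new KafkaProducer<>(props)` along with your serializers):

```java
import java.util.Properties;

// Builds the strictest-ordering producer configuration described above:
// one in-flight batch, unlimited retries, full-ISR acks, indefinite blocking.
public class StrictOrderingConfig {
    public static Properties props() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("acks", "all");                         // fully replicated before ack
        props.put("retries", String.valueOf(Integer.MAX_VALUE));
        props.put("max.block.ms", String.valueOf(Long.MAX_VALUE));
        props.put("max.in.flight.requests.per.connection", "1"); // one batch at a time
        return props;
    }
}
```

Note the throughput cost: with a single in-flight request per connection, the
producer cannot pipeline batches, so this trades throughput for ordering.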

-Ewen

On Wed, Dec 28, 2016 at 11:46 AM, Ali Akhtar  wrote:

> This will only ensure the order of delivery though, not the actual order of
> the events, right?
>
> I.e if due to network lag or any other reason, if the producer sends A,
> then B, but B arrives before A, then B will be returned before A even if
> they both went to the same partition. Am I correct about that?
>
> Or can I use KTables to ensure A is processed before B? (Both messages will
> have a timestamp which is being extracted by a TimestampExtractor ).
>
> On Tue, Dec 27, 2016 at 8:15 PM, Tauzell, Dave <
> dave.tauz...@surescripts.com
> > wrote:
>
> > If you specify a key with each message then all messages with the same
> key
> > get sent to the same partition.
> >
> > > On Dec 26, 2016, at 23:32, Ali Akhtar  wrote:
> > >
> > > How would I route the messages to a specific partition?
> > >
> > >> On 27 Dec 2016 10:25 a.m., "Asaf Mesika" 
> wrote:
> > >>
> > >> There is a much easier approach: your can route all messages of a
> given
> > Id
> > >> to a specific partition. Since each partition has a single writer you
> > get
> > >> the ordering you wish for. Of course this won't work if your updates
> > occur
> > >> in different hosts.
> > >> Also maybe Kafka streams can help shard the based on item Id to a
> second
> > >> topic
> > >>> On Thu, 22 Dec 2016 at 4:31 Ali Akhtar  wrote:
> > >>>
> > >>> The batch size can be large, so in memory ordering isn't an option,
> > >>> unfortunately.
> > >>>
> > >>> On Thu, Dec 22, 2016 at 7:09 AM, Jesse Hodges <
> hodges.je...@gmail.com>
> > >>> wrote:
> > >>>
> >  Depending on the expected max out of order window, why not order
> them
> > >> in
> >  memory? Then you don't need to reread from Cassandra, in case of a
> > >>> problem
> >  you can reread data from Kafka.
> > 
> >  -Jesse
> > 
> > > On Dec 21, 2016, at 7:24 PM, Ali Akhtar 
> > >> wrote:
> > >
> > > - I'm receiving a batch of messages to a Kafka topic.
> > >
> > > Each message has a timestamp, however the messages can arrive / get
> >  processed out of order. I.e event 1's timestamp could've been a few
> > >>> seconds
> >  before event 2, and event 2 could still get processed before event
> 1.
> > >
> > > - I know the number of messages that are sent per batch.
> > >
> > > - I need to process the messages in order. The messages are
> basically
> >  providing the history of an item. I need to be able to track the
> > >> history
> >  accurately (i.e, if an event occurred 3 times, i need to accurately
> > log
> > >>> the
> >  dates of the first, 2nd, and 3rd time it occurred).
> > >
> > > The approach I'm considering is:
> > >
> > > - Creating a cassandra table which is ordered by the timestamp of
> the
> >  messages.
> > >
> > > - Once a batch of messages has arrived, writing them all to
> > >> cassandra,
> >  counting on them being ordered by the timestamp even if they are
> > >>> processed
> >  out of order.
> > >
> > > - Then iterating over the messages in the cassandra table, to
> process
> >  them in order.
> > >
> > > However, I'm concerned about Cassandra's eventual consistency.
> Could
> > >> it
> >  be that even though I wrote the messages, they are not there when I
> > try
> > >>> to
> >  read them (which would be almost immediately after they are
> written)?
> > >
> > > Should I enforce consistency = ALL to make sure the messages will
> be
> >  available immediately after being written?
> > >
> > > Is there a better way to handle this thru either Kafka streams or
> >  Cassandra?
> > 
> > >>>
> > >>
> > This e-mail and any files transmitted with it are confidential, may
> > contain sensitive information, and are intended solely for the use of the
> > individual or entity to whom they are addressed. If you have received
> this
> > e-mail in error, please notify the sender by reply e-mail immediately and
> > destroy all copies of the e-mail and any attachments.
> >
>


Re: Is it possible for consumers within a single consumer group to have different subscriptions?

2016-12-21 Thread Ewen Cheslack-Postava
It is possible for them to have different subscriptions. Consumers will
only be assigned partitions from topics to which they are subscribed. So if
you need to modify your app use data from an additional topic, you can
safely do a rolling deploy of the updated version and during the period
where there is a mix of both old and new versions of the app only the new
ones will be consuming from the added topic.

-Ewen

On Wed, Dec 21, 2016 at 6:33 PM, Jeff Widman  wrote:

> Searched for a while and not finding a clear answer.
>
> Is it possible for consumers within a single consumer group to have
> different topic subscriptions?
>
> If no, if any one of the consumers calls subscribe() with new topic list,
> how is that subscription propagated to the other consumers in the group? Is
> a rebalance immediately triggered?
>


Re: Producer connect timeouts

2016-12-19 Thread Ewen Cheslack-Postava
Yes, this is something that we could consider fixing in Kafka itself.
Pretty much all timeouts can be customized if the OS/network defaults are
larger than what makes sense for the system. And given the large
default values for some of these timeouts, we probably don't want to rely
on the defaults.

-Ewen

On Mon, Dec 19, 2016 at 8:23 AM, Luke Steensen <luke.steensen@
braintreepayments.com> wrote:

> Makes sense, thanks Ewen.
>
> Is this something we could consider fixing in Kafka itself? I don't think
> the producer is necessarily doing anything wrong, but the end result is
> certainly very surprising behavior. It would also be nice not to have to
> coordinate request timeouts, retries, and the max block configuration with
> system-level configs.
>
>
> On Sat, Dec 17, 2016 at 6:55 PM, Ewen Cheslack-Postava <e...@confluent.io>
> wrote:
>
> > Without having dug back into the code to check, this sounds right.
> > Connection management just fires off a request to connect and then
> > subsequent poll() calls will handle any successful/failed connections.
> The
> > timeouts wrt requests are handled somewhat differently (the connection
> > request isn't explicitly tied to the request that triggered it, so when
> the
> > latter times out, we don't follow up and timeout the connection request
> > either).
> >
> > So yes, you currently will have connection requests tied to your
> underlying
> > TCP timeout request. This tends to be much more of a problem in public
> > clouds where the handshake request will be silently dropped due to
> firewall
> > rules.
> >
> > The metadata.max.age.ms is a workable solution, but agreed that it's not
> > great. If possible, reducing the default TCP connection timeout isn't
> > unreasonable either -- the defaults are set for WAN connections (and
> > arguably set for WAN connections of long ago), so much more aggressive
> > timeouts are reasonable for Kafka clusters.
> >
> > -Ewen
> >
> > On Fri, Dec 16, 2016 at 1:41 PM, Luke Steensen <luke.steensen@
> > braintreepayments.com> wrote:
> >
> > > Hello,
> > >
> > > Is it correct that producers do not fail new connection establishment
> > when
> > > it exceeds the request timeout?
> > >
> > > Running on AWS, we've encountered a problem where certain very low
> volume
> > > producers end up with metadata that's sufficiently stale that they
> > attempt
> > > to establish a connection to a broker instance that has already been
> > > terminated as part of a maintenance operation. I would expect this to
> > fail
> > > and be retried normally, but it appears to hang until the system-level
> > TCP
> > > connection timeout is reached (2-3 minutes), with the writes themselves
> > > being expired before even a single attempt is made to send them.
> > >
> > > We've worked around the issue by setting `metadata.max.age.ms`
> extremely
> > > low, such that these producers are requesting new metadata much faster
> > than
> > > our maintenance operations are terminating instances. While this does
> > work,
> > > it seems like an unfortunate workaround for some very surprising
> > behavior.
> > >
> > > Thanks,
> > > Luke
> > >
> >
>


Re: TLS

2016-12-19 Thread Ewen Cheslack-Postava
Ruben,

There are step-by-step instructions explained here:
http://docs.confluent.io/3.1.1/kafka/security.html
For the purposes of configuring Kafka, JAAS (the Java Authentication and
Authorization Service) basically boils down to a login configuration that you
supply in a plain-text configuration file and point the JVM at.
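
As an illustration, the broker reads its login modules from a file passed with
-Djava.security.auth.login.config=<path>. A minimal Kerberos-style sketch
(keytab paths and principals below are placeholders, not from this thread):

```
KafkaServer {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    keyTab="/etc/security/keytabs/kafka.keytab"
    principal="kafka/broker1.example.com@EXAMPLE.COM";
};

Client {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    keyTab="/etc/security/keytabs/kafka.keytab"
    principal="kafka/broker1.example.com@EXAMPLE.COM";
};
```

The KafkaServer section covers clients authenticating to the broker; the
Client section is what the broker uses when authenticating to ZooKeeper.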

-Ewen

On Mon, Dec 19, 2016 at 8:40 AM, Ruben Poveda Teba  wrote:

> Hello,
>
> I'm trying set SSL/TLS between Kafka-Zookeeper-Kafka but in the
> documentation you explain that the security must set as JAAS, but what is
> JAAS and how configure it?
>
> Un saludo,
>
>
> Rubén Poveda Teba
> Security Infrastructures
> rpov...@sia.es
>


Re: Kafka Connect gets into a rebalance loop

2016-12-17 Thread Ewen Cheslack-Postava
The message

> Wasn't unable to resume work after last rebalance

means that previous iterations of the rebalance left this worker behind/out
of sync with other members of the group, i.e. it had not read up to the
same point in the config topic so it wouldn't be safe for this worker (or
possibly the entire cluster if this worker was the leader) to resume work.
(I think there's a typo in the log message, it should say "wasn't *able* to
resume work".)

This message indicates the problem:

> Catching up to assignment's config offset.

The leader was using configs that were newer than this member, so it's not
safe for it to start its assigned work since it might be using outdated
configuration. When it tries to catch up, it continues trying to read up
until the end of the config topic, which should be at least as far as the
leader indicated its position was. (Another gap in logging: that message
should really include the offset it is trying to catch up to, although you
can also check that manually since it'll always be trying to read to the
end of the topic.)

This catch up has a timeout which defaults to 3s (which is pretty
substantial given the rate at which configs tend to be written and their
size). The fact that your worker isn't able to catch up probably indicates
a connectivity issue or possibly even some misconfiguration where one
worker is looking at one cluster/config topic, and the other is in the same
group in the same cluster but looking at a different cluster/config topic
when reading configs.
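
To rule out the misconfiguration case: these settings must agree across every
worker in the same Connect cluster. A sketch of the relevant lines in each
worker's distributed properties file (names and hosts below are placeholders):

```
# connect-distributed.properties -- must be identical on every worker
bootstrap.servers=broker1:9092,broker2:9092
group.id=my-connect-cluster
config.storage.topic=connect-configs
offset.storage.topic=connect-offsets
status.storage.topic=connect-statuses

# catch-up timeout discussed above (defaults to 3000 ms)
worker.sync.timeout.ms=3000
```

If two workers share a group.id but point at different clusters or different
config.storage.topic values, you get exactly this kind of rebalance loop.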

-Ewen

On Fri, Dec 16, 2016 at 3:16 AM, Frank Lyaruu  wrote:

> Hi people,
>
> I've just deployed my Kafka Streams / Connect (I only use a connect sink to
> mongodb) application on a cluster of four instances (4 containers on 2
> machines) and now it seems to get into a sort of rebalancing loop, and I
> don't get much in mongodb, I've got a little bit of data at the beginning,
> but no new data appears.
>
> The rest of the streams application seems to behave.
>
> This is what I get in my log, but at a pretty high speed (about 100 per
> second):
>
> Current config state offset 3 is behind group assignment 5, reading to end
> of config log
> Joined group and got assignment: Assignment{error=0,
> leader='connect-2-8fb3bfc4-93f2-4d08-82df-8e7c4b99ec13', leaderUrl='',
> offset=5, connectorIds=[KNVB-production-generation-99-person-mongosink],
> taskIds=[]}
> Successfully joined group NHV-production-generation-99-person-mongosink
> with generation 6
> Successfully joined group KNVB-production-generation-99-person-mongosink
> with generation 6
> Wasn't unable to resume work after last rebalance, can skip stopping
> connectors and tasks
> Rebalance started
> Wasn't unable to resume work after last rebalance, can skip stopping
> connectors and tasks
> (Re-)joining group KNVB-production-generation-99-person-mongosink
> Current config state offset 3 does not match group assignment 5. Forcing
> rebalance.
> Finished reading to end of log and updated config snapshot, new config log
> offset: 3
> Finished reading to end of log and updated config snapshot, new config log
> offset: 3
> Current config state offset 3 does not match group assignment 5. Forcing
> rebalance.
> Joined group and got assignment: Assignment{error=0,
> leader='connect-1-1893fd59-3ce8-4061-8131-ae36e58f5524', leaderUrl='',
> offset=5, connectorIds=[], taskIds=[]}
> Current config state offset 3 is behind group assignment 5, reading to end
> of config log
> Successfully joined group KNVB-production-generation-99-person-mongosink
> with generation 6
> (Re-)joining group KNVB-production-generation-99-person-mongosink
> Current config state offset 3 does not match group assignment 5. Forcing
> rebalance.Rebalance started
> Current config state offset 3 is behind group assignment 5, reading to end
> of config log
> Catching up to assignment's config offset.
> Successfully joined group NHV-production-generation-99-person-mongosink
> with generation 6
> Joined group and got assignment: Assignment{error=0,
> leader='connect-2-8fb3bfc4-93f2-4d08-82df-8e7c4b99ec13', leaderUrl='',
> offset=5, connectorIds=[], taskIds=[]}
> Catching up to assignment's config offset.
> Joined group and got assignment: Assignment{error=0,
> leader='connect-2-8fb3bfc4-93f2-4d08-82df-8e7c4b99ec13', leaderUrl='',
> offset=5, connectorIds=[], taskIds=[]}
> (Re-)joining group NHV-production-generation-99-person-mongosink
> Wasn't unable to resume work after last rebalance, can skip stopping
> connectors and tasks
> Successfully joined group NHV-production-generation-99-person-mongosink
> with generation 6
> Current config state offset 3 does not match group assignment 5. Forcing
> rebalance.
> Finished reading to end of log and updated config snapshot, new config log
> offset: 3
> Current config state offset 3 does not match group assignment 5. Forcing
> rebalance.
> Rebalance started
>
> ... and so on..
>
> Any ideas?
>
> regards, Frank
>


Re: __consumer_offsets topic acks

2016-12-17 Thread Ewen Cheslack-Postava
The default is -1 which means all replicas need to replicate the committed
data before the ack will be sent to the consumer. See
the offsets.commit.required.acks setting for the broker.

min.insync.replicas applies to the offsets topic as well, but defaults to
1. You may want to increase this (either on a per-topic basis, or globally
if that availability tradeoff is acceptable for you).
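
For reference, the relevant broker defaults, plus a per-topic override for
the offsets topic (0.9-era command shape; host/port are placeholders):

```
# server.properties (broker) -- defaults shown
offsets.commit.required.acks=-1     # -1 = wait for all replicas before acking
offsets.topic.replication.factor=3
min.insync.replicas=1               # global default; can be raised

# Per-topic override for the internal offsets topic:
#   bin/kafka-topics.sh --zookeeper localhost:2181 --alter \
#     --topic __consumer_offsets --config min.insync.replicas=2
```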

-Ewen

On Fri, Dec 16, 2016 at 6:15 PM, Fang Wong  wrote:

> Hi,
>
> What is the value of acks set for kafka internal topic __consumer_offsets?
> I know the default replication factor for __consumer_offsets is 3, and
> we are using version 0.9.01, and set min.sync.replicas = 2 in our
> server.properties.
> We noticed some partitions of __consumer_offsets has ISR with 1, some
> with 2 or 3.
>
> Thanks,
> Fang
>


Re: What does GetOffsetShell result represent

2016-12-17 Thread Ewen Cheslack-Postava
The tool writes output in the format:

<topic>:<partition>:<offset>

So in the case of your example with --time -1 that returned
test-window-stream:0:724, it is saying that test-window-stream has
partition 0 with a valid log segment which has the first offset = 724. Note
that --time -1 is a special code for "only give the latest segment". For
the -2 option you saw 0:0 because the earliest segment is still currently
the first segment created for that topic, i.e. the one starting with offset
0 (because your retention policy hasn't kicked in yet). In cases where it
returned multiple comma separated values, those are multiple starting
offsets for segments containing offsets before the timestamp you requested.

Note that this is using the old simple consumer approach to querying for
offsets. That implementation could only tell you about offsets based on
entire log segments. If you're on a newer version of Kafka, you may be
interested in the time-based offset query, which you can access in the new
consumer API and gives you better granularity:
http://kafka.apache.org/0101/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#offsetsForTimes(java.util.Map)
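
To make the output format concrete, a small stdlib-only sketch that pulls the
segment start offsets out of one output line (a hypothetical helper for
illustration, not part of Kafka's tooling):

```java
import java.util.Arrays;

// Parses one GetOffsetShell output line of the form
//   topic:partition:offset[,offset...]
// and returns the comma-separated segment start offsets.
public class GetOffsetShellLine {
    public static long[] offsets(String line) {
        String[] parts = line.split(":");
        // parts[0] = topic name, parts[1] = partition id,
        // parts[2] = starting offsets of the matching log segments
        return Arrays.stream(parts[2].split(","))
                     .mapToLong(Long::parseLong)
                     .toArray();
    }
}
```

For the example above, "test-window-stream:0:724,0" parses to topic
test-window-stream, partition 0, segment start offsets 724 and 0.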

-Ewen

On Thu, Dec 15, 2016 at 3:52 AM, Sachin Mittal  wrote:

> Folks any explanation for this. Or any link that can help me on that.
>
> On Tue, Dec 13, 2016 at 1:00 PM, Sachin Mittal  wrote:
>
> > Hi,
> > I have some trouble interpreting the result of GetOffsetShell command.
> >
> > Say if I run
> > bin\windows\kafka-run-class.bat kafka.tools.GetOffsetShell --broker-list
> > localhost:9092 --topic test-window-stream --time -2
> > test-window-stream:0:0
> >
> > D:\kafka_2.10-0.10.2.0-SNAPSHOT>bin\windows\kafka-run-class.bat
> > kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic
> > test-window-stream --time -1
> > test-window-stream:0:724
> >
> > bin\windows\kafka-run-class.bat kafka.tools.GetOffsetShell --broker-list
> > localhost:9092 --topic test-window-stream
> > test-window-stream:0:724
> >
> > D:\kafka_2.10-0.10.2.0-SNAPSHOT>bin\windows\kafka-run-class.bat
> > kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic
> > test-window-stream --offset 2
> > test-window-stream:0:724,0
> >
> > So what does 0:0 mean or for that mater 0:724
> >
> > Also when I use  -2 (earliest), why does it show 0:0, does this means
> > offset if at the start of the topic.
> > Also when I use -1 (latest), it shows 0:724, does it means none of the
> > records are processed.
> > However I was having a stream application running which did aggregate
> > atleast  upto 720 records before I killed it. So why does it not show
> here?
> >
> > Also it is not clear what --offset flag means and when I pass something
> > greater than 1 I get ,0 appended to the earlier result.
> >
> > Thanks
> > Sachin
> >
> >
>


Re: Kafka connect distributed mode not distributing the work

2016-12-17 Thread Ewen Cheslack-Postava
Hi Manjunath,

I think you're seeing a case of this issue:
https://issues.apache.org/jira/browse/KAFKA-4553
where the way round robin assignment works with
an even # of workers and connectors that generate only 1 task generates
uneven work assignments because connectors aren't really equivalent to
tasks in practice (all the connectors are getting assigned to 1 worker, all
the tasks to the other).

I've filed a PR to address the issue, though it probably won't be available
until the next release (but it'll work automatically once you upgrade). The
easiest short term solution is to simply increase to 3 worker processes
since that'll avoid assigning all the connectors to the first worker and
all tasks to the second.

-Ewen

On Wed, Dec 14, 2016 at 8:59 PM, Manjunath Shetty H <
manjunathshe...@live.com> wrote:

> Hi all,
>
>
> I am running kafka connect using 2 node cluster. I have 5 connectors
> running with maxtask 1 each. But all the tasks are running in same node,
> work is not distributed across 2 nodes.
>
>
> I am using a custom source connectors.
>
>
> Any help is appreciated
>
>
> Thanks
>
> Manjunath
>


Re: Producer connect timeouts

2016-12-17 Thread Ewen Cheslack-Postava
Without having dug back into the code to check, this sounds right.
Connection management just fires off a request to connect and then
subsequent poll() calls will handle any successful/failed connections. The
timeouts wrt requests are handled somewhat differently (the connection
request isn't explicitly tied to the request that triggered it, so when the
latter times out, we don't follow up and timeout the connection request
either).

So yes, you currently will have connection requests tied to your underlying
TCP timeout request. This tends to be much more of a problem in public
clouds where the handshake request will be silently dropped due to firewall
rules.

The metadata.max.age.ms is a workable solution, but agreed that it's not
great. If possible, reducing the default TCP connection timeout isn't
unreasonable either -- the defaults are set for WAN connections (and
arguably set for WAN connections of long ago), so much more aggressive
timeouts are reasonable for Kafka clusters.

-Ewen

On Fri, Dec 16, 2016 at 1:41 PM, Luke Steensen  wrote:

> Hello,
>
> Is it correct that producers do not fail new connection establishment when
> it exceeds the request timeout?
>
> Running on AWS, we've encountered a problem where certain very low volume
> producers end up with metadata that's sufficiently stale that they attempt
> to establish a connection to a broker instance that has already been
> terminated as part of a maintenance operation. I would expect this to fail
> and be retried normally, but it appears to hang until the system-level TCP
> connection timeout is reached (2-3 minutes), with the writes themselves
> being expired before even a single attempt is made to send them.
>
> We've worked around the issue by setting `metadata.max.age.ms` extremely
> low, such that these producers are requesting new metadata much faster than
> our maintenance operations are terminating instances. While this does work,
> it seems like an unfortunate workaround for some very surprising behavior.
>
> Thanks,
> Luke
>


Re: How to disable auto commit for SimpleConsumer kafka 0.8.1

2016-12-10 Thread Ewen Cheslack-Postava
The simple consumer doesn't do auto-commit. It really only issues
individual low-level Kafka protocol request types, so `commitOffsets` is
the only way it should write offsets.

Is it possible it crashed after the request was sent but before finishing
reading the response?

Side note: I know you mentioned 0.8.1, but if at all possible we'd highly
recommend moving to the new consumer. It supports both simple and consumer
group modes and is what will be supported in the long term.

-Ewen

On Tue, Dec 6, 2016 at 12:47 PM, Anjani Gupta 
wrote:

> I want to disable auto commit for kafka SimpleConsumer. I am using 0.8.1
> version.For High level consumer, config options can be set and passed via
> consumerConfig as follows kafka.consumer.Consumer.
> createJavaConsumerConnector(this.consumerConfig);
>
> How can I achieve the same for SimpleConsumer? I mainly want to disable
> auto commit. I tried setting auto commit to false in consumer.properties
> and restarted kafka server, zookeeper and producer. But, that does not
> work. I think I need to apply this setting through code, not in
> consumer.properties. Can anyone help here?
>
> Here is how my code looks like
>
> List<TopicAndPartition> topicAndPartitionList = new ArrayList<>();
> topicAndPartitionList.add(topicAndPartition);
> OffsetFetchResponse offsetFetchResponse = consumer.fetchOffsets(new
>     OffsetFetchRequest("testGroup", topicAndPartitionList, (short) 0,
>     correlationId, clientName));
>
> Map<TopicAndPartition, OffsetMetadataAndError> offsets =
>     offsetFetchResponse.offsets();
> FetchRequest req = new FetchRequestBuilder()
>     .clientId(clientName)
>     .addFetch(a_topic, a_partition,
>         offsets.get(topicAndPartition).offset(), 10)
>     .build();
> long readOffset = offsets.get(topicAndPartition).offset();
> FetchResponse fetchResponse = consumer.fetch(req);
>
> // Consume messages from fetchResponse
>
> Map<TopicAndPartition, OffsetMetadataAndError> requestInfo = new
>     HashMap<>();
> requestInfo.put(topicAndPartition, new
>     OffsetMetadataAndError(readOffset, "metadata", (short) 0));
> OffsetCommitResponse offsetCommitResponse = consumer.commitOffsets(new
>     OffsetCommitRequest("testGroup", requestInfo, (short) 0,
>     correlationId, clientName));
>
>
> If above code crashes before committing offset, I still get latest offset
> as result of offsets.get(topicAndPartition).offset() in next run which
> makes me to think that auto commit of offset happens as code is executed.
>



-- 
Thanks,
Ewen


Re: Best approach to frequently restarting consumer process

2016-12-10 Thread Ewen Cheslack-Postava
Consumer groups aren't going to handle 'let it crash' particularly well
(and really any session-based services, but particularly consumer groups
since a single failure affects the entire group). That said, 'let it crash'
doesn't necessarily have to mean 'don't try to clean up at all'. The
consumer group will recover *much* more quickly if you make sure any crash
path includes a:

finally {
   consumer.close();
}

block to do some minimal cleanup. This will cause the consumer to make a
best effort to explicitly leave the group, allowing rebalancing to complete
after the rest of the members rejoin. If you don't do this, your rebalances
get much more expensive since the group coordinator needs to wait for the
session timeout. This will probably lead to noticeably longer pauses. The
one drawback to doing this today is that the close() can potentially block,
so it may not fail as fast as you want it to -- it would be good to get a
timeout-based close() implemented as well. That said, the LeaveGroup
request *is* best effort, so if the consumer was otherwise in a healthy
state, this should be very fast.

All this said, 'let it crash' isn't the same thing as 'constant crashes are
ok'. It's a fault recovery methodology, but crashing every 5 minutes isn't
what the telecom industry had in mind... If things are crashing that
frequently, there is likely a very common bug/memory leak/etc which can be
fixed to significantly reduce the frequency of crashes. Generally 'let it
crash' systems also provide a good way to also collect debugging
information for exactly this purpose.

-Ewen

On Wed, Dec 7, 2016 at 1:38 AM, Harald Kirsch 
wrote:

> With 'restart' I mean a 'let it crash' setup (as promoted by Erlang and
> Akka, e.g. http://doc.akka.io/docs/akka/snapshot/intro/what-is-akka.html).
> The consumer gets in trouble due to an OOM or a runaway computation or
> whatever that we want to preempt somehow. It crashes or gets killed
> externally.
>
> So whether close() is called or not in the dying process, I don't know.
> But clearly the subscribe is called after a restart.
>
> I understand that we are out of luck with this. We would have to separate
> the crashing part out into a different operating system process, but must
> keep the consumer running all time. :-(
>
> Thanks for the insight
> Harald
>
>
> On 06.12.2016 19:26, Gwen Shapira wrote:
>
>> Can you clarify what you mean by "restart"? If you call
>> consumer.close() and consumer.subscribe() you will definitely trigger
>> a rebalance.
>>
>> It doesn't matter if its "same consumer knocking", we already
>> rebalance when you call consumer.close().
>>
>> Since we want both consumer.close() and consumer.subscribe() to cause
>> rebalance immediately (and not wait for heartbeat), I don't think
>> we'll be changing their behavior.
>>
>> Depending on why consumers need to restart, I'm wondering if you can
>> restart other threads in your application but keep the consumer up and
>> running to avoid the rebalances.
>>
>> On Tue, Dec 6, 2016 at 7:18 AM, Harald Kirsch 
>> wrote:
>>
>>> We have consumer processes which need to restart frequently, say, every 5
>>> minutes. We have 10 of them so we are facing two restarts every minute on
>>> average.
>>>
>>> 1) It seems that nearly every time a consumer restarts  the group is
>>> rebalanced. Even if the restart takes less than the heartbeat interval.
>>>
>>> 2) My guess is that the group manager just cannot know that the same
>>> consumer is knocking at the door again.
>>>
>>> Are my suspicions (1) and (2) correct? Is there a chance to fix this such
>>> that a restart within the heartbeat interval does not lead to a
>>> re-balance?
>>> Would a well defined client.id help?
>>>
>>> Regards
>>> Harald
>>>
>>>
>>
>>
>>


-- 
Thanks,
Ewen


Re: Configuration for low latency and low cpu utilization? java/librdkafka

2016-12-10 Thread Ewen Cheslack-Postava
On the producer side, there's not much you can do to reduce CPU usage if
you want low latency and don't have enough throughput to buffer multiple
messages -- you're going to end up sending 1 record at a time in order to
achieve your desired latency. Note, however, that the producer is thread
safe, so if it is possible to combine multiple processes into a single
multi-threaded app, you might be able to share a single producer and get
better batching.

One the consumer side, for the Java client fetch.min.bytes is already set
to 1, which will minimize latency -- data will be returned as soon as any
data is available. If you are consistently seeing poll() return no messages
in your consumers, try increasing fetch.max.wait.ms. It defaults to 500ms,
so I'm guessing you're not hitting this, but if your data is spread across
enough partitions and brokers, it's possible you are sending out a bunch of
fetch requests that aren't returning any data.

Also, as with producers, if you have light enough traffic you will benefit
by consolidating to fewer consumers if possible. Fetch requests are made
one at a time for *all* partitions the consumer is reading from that have
the same leader, which means you'll amortize the cost of requests over
multiple topic partitions (while maintaining the low latency guarantees
when traffic in all the partitions is light anyway).

Finally, as always, your best bet is to measure metrics & profile your app
to see where the CPU time is going.
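
For the Java consumer, the knobs discussed above look like this in a
properties file (values shown are the defaults, for illustration):

```
# Java consumer -- latency-oriented settings
fetch.min.bytes=1        # default; broker responds as soon as any data exists
fetch.max.wait.ms=500    # default; raise it if poll() often returns empty,
                         # which trades a little latency for less polling CPU
```

The librdkafka analogue mentioned in the thread is socket.blocking.max.ms,
with the same latency-versus-CPU tradeoff.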

-Ewen

On Thu, Dec 8, 2016 at 7:44 AM, Niklas Ström  wrote:

> Use case scenario:
> We want to have a fairly low latency, say below 20 ms, and we want to be
> able to run a few hundred processes (on one machine) both producing and
> consuming a handful of topics. The throughput is not high, lets say on
> average 10 messages per second for each process. Most messages are 50-500
> bytes large, some may be a few kbytes.
>
> How should we adjust the configuration parameters for our use case?
>
> Our experiments so far gives us a good latency but at the expence of CPU
> utilization. Even with a bad latency, the CPU utilization is not
> satisfying. Since we will have a lot of processes we are concerned that
> short poll loops will cause an overconsumption of CPU capacity. We are
> hoping we might have missed some configuration parameter or that we have
> some issues with our environment that we can find and solve.
>
> We are using both the java client and librdkafka and see similar CPU issues
> in both clients.
>
> We have looked at recommendations from:
> https://github.com/edenhill/librdkafka/wiki/How-to-decrease-message-latency
> The only thing that seems to really make a difference for librdkafka is
> socket.blocking.max.ms, but reducing that also makes the CPU go up.
>
> I would really appreciate input on configuration parameters and of any
> experience with environment issues that has caused CPU load. Or is our
> scenario not feasible at all?
>
> Cheers
>



-- 
Thanks,
Ewen


Re: Upgrading from 0.10.0.1 to 0.10.1.0

2016-12-10 Thread Ewen Cheslack-Postava
Hagen,

What does "new consumer doesn't like the old brokers" mean exactly?

When upgrading MM, remember that it uses the clients internally so the same
compatibility rules apply: you need to upgrade both sets of brokers before
you can start using the new version of MM.

-Ewen

On Thu, Dec 8, 2016 at 6:32 AM, Hagen Rother 
wrote:

> Hi,
>
> I am testing an upgrade and I am stuck on the mirror maker.
>
> - The new consumer doesn't like the old brokers
> - The old consumer comes up, but does nothing and throws
> a java.net.SocketTimeoutException after a while.
>
> What's the correct upgrade strategy when mirroring is used?
>
> Thanks!
> Hagen
>



-- 
Thanks,
Ewen


Re: NotEnoughReplication

2016-12-10 Thread Ewen Cheslack-Postava
This error doesn't necessarily mean that a broker is down, it can also mean
that too many replicas for that topic partition have fallen behind the
leader. This indicates your replication is lagging for some reason.

You'll want to monitor some of the metrics listed here:
http://kafka.apache.org/documentation.html#monitoring to help you
understand a) when this occurs (the number of under-replicated partitions
being a critical one) and b) what the cause might be (e.g. a saturated
network, slow request processing due to some other resource contention,
etc.).
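To paraphrase the error quoted below as a sketch (illustrative names only,
not Kafka's actual internals): with min.insync.replicas=2, a partition whose
in-sync replica set has shrunk to just the leader will reject acks=all
produce requests until the followers catch up.

```python
# Illustrative check, not Kafka's real broker code: a produce request with
# acks=all is rejected when the ISR is smaller than min.insync.replicas.
min_insync_replicas = 2
isr = {"broker-1"}  # followers have fallen out of sync; only the leader remains

accepts_acks_all_writes = len(isr) >= min_insync_replicas
print(accepts_acks_all_writes)  # False -> NotEnoughReplicasException
```

Once the lagging replicas rejoin the ISR, writes succeed again without any
manual intervention, which is why monitoring for the root cause of the lag
matters more than "fixing" the error itself.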

-Ewen

On Fri, Dec 9, 2016 at 5:20 PM, Mohit Anchlia 
wrote:

> What's the best way to fix NotEnoughReplication given all the nodes are up
> and running? Zookeeper did go down momentarily. We are on Kafka 0.10
>
> org.apache.kafka.common.errors.NotEnoughReplicasException: Number of
> insync replicas for partition [__consumer_offsets,20] is [1], below
> required minimum [2]
>



-- 
Thanks,
Ewen


Re: Running mirror maker between two different version of kafka

2016-12-10 Thread Ewen Cheslack-Postava
It's tough to read that stacktrace, but if I understand what you mean by
"running the kafka mirroring in destination cluster which is 0.10.1.0
version of kafka", then the problem is that you cannot use a 0.10.1.0
mirror maker with a 0.8.1 cluster. MirrorMaker is both a producer and a
consumer, so the version used has to be client-compatible with both the
source and destination clusters. For Kafka, that means the client (MM) can
use at most version min(src cluster version, dest cluster version).
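The rule can be sketched as follows (an illustrative helper, not a real
Kafka API; the function name is made up):

```python
# MirrorMaker embeds both a consumer and a producer, so its client version
# can be at most the older of the two cluster versions it talks to.
def max_mirror_maker_version(src_cluster, dest_cluster):
    # Version tuples compare lexicographically, e.g. (0, 8, 1) < (0, 10, 1).
    return min(src_cluster, dest_cluster)

print(max_mirror_maker_version((0, 8, 1), (0, 10, 1)))  # (0, 8, 1)
```

In this case that means running a MirrorMaker from the 0.8.1 distribution
until the source cluster is upgraded.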

-Ewen

On Thu, Dec 8, 2016 at 6:24 PM, Vijayanand Rengarajan <
vijayanand.rengara...@yahoo.com.invalid> wrote:

> Team,
> I am trying to mirror a few topics from cluster A (version 0.8.1) to
> cluster B (version 0.10.1.0), but due to version incompatibility I am
> getting the error below. If any one of you has had similar issues, please
> share the workaround/solution.
> I am running the Kafka mirroring in the destination cluster, which has
> Kafka 0.10.1.0 installed.
> There is no firewall or iptables between these two clusters.
>  WARN [ConsumerFetcherThread-console-consumer-27615_kafkanode01-1481247967907-68767097-0-30],
> Error in fetch kafka.consumer.ConsumerFetcherThread$FetchRequest@26902baa
> (kafka.consumer.ConsumerFetcherThread)
> java.io.EOFException
> at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:83)
> at kafka.network.BlockingChannel.readCompletely(BlockingChannel.scala:129)
> at kafka.network.BlockingChannel.receive(BlockingChannel.scala:120)
> at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:99)
> at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:83)
> at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:132)
> at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:132)
> at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:132)
> at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
> at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:131)
> at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:131)
> at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:131)
> at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
> at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:130)
> at kafka.consumer.ConsumerFetcherThread.fetch(ConsumerFetcherThread.scala:109)
> at kafka.consumer.ConsumerFetcherThread.fetch(ConsumerFetcherThread.scala:29)
> at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:118)
> at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:103)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
>
> Thanks, vijayanand.




-- 
Thanks,
Ewen

