Re: Jars in Kafka 0.10

2016-08-02 Thread Ewen Cheslack-Postava
This is a combination of Connect and Streams, and actually probably more
related to Connect because it pulls in a bunch of jars to implement its
REST API. You don't need all of them to run any of the individual
components, but they are currently bundled together in a way that is,
unfortunately, not easy to untangle. Note that you won't be including all
of these if you're using any of the client libraries via Maven dependencies
(which includes the producer, consumer, and streams).

On Fri, Jul 29, 2016 at 7:28 AM, Gerard Klijs wrote:

> No, if you don't use Streams you don't need them. If you have no clients
> (so also no MirrorMaker) running on the same machine, you also don't need
> the client jar, and if you run ZooKeeper separately you don't need those
> jars either.
>
> On Fri, Jul 29, 2016 at 4:22 PM Bhuvaneswaran Gopalasami <
> bhuvanragha...@gmail.com> wrote:
>
> > I have recently started looking into Kafka, and I noticed that the number
> > of jars in Kafka 0.10 has increased compared to 0.8.2. Do we really need
> > all of those libraries to run Kafka ?
> >
> > Thanks,
> > Bhuvanes
> >
>



-- 
Thanks,
Ewen


[jira] [Commented] (KAFKA-4010) ConfigDef.toRst() should create sections for each group

2016-08-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15405270#comment-15405270
 ] 

ASF GitHub Bot commented on KAFKA-4010:
---

GitHub user rekhajoshm reopened a pull request:

https://github.com/apache/kafka/pull/1696

KAFKA-4010; ConfigDef.toRst() to have grouped sections with dependents info

- Added sort method with group order
- Added dependents info 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rekhajoshm/kafka KAFKA-4010

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1696.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1696


commit 630fd4c5220ca5c934492318037f8a493a305b01
Author: Joshi 
Date:   2016-08-03T04:08:41Z

KAFKA-4010; ConfigDef.toEnrichedRst() to have grouped sections with 
dependents info




> ConfigDef.toRst() should create sections for each group
> ---
>
> Key: KAFKA-4010
> URL: https://issues.apache.org/jira/browse/KAFKA-4010
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Shikhar Bhushan
>Assignee: Rekha Joshi
>Priority: Minor
>
> Currently the ordering seems a bit arbitrary. There is a logical grouping 
> that connectors are now able to specify with the 'group' field, which we 
> should use as section headers. Also it would be good to generate {{:ref:}} 
> for each section.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
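
For context on what the grouped output would involve, here is a rough,
illustrative sketch of the sorting/sectioning logic (not the patch itself),
assuming ConfigDef.ConfigKey's public group, orderInGroup, and dependents
fields; the class and method names are invented:

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.Comparator;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;
    import org.apache.kafka.common.config.ConfigDef;
    import org.apache.kafka.common.config.ConfigDef.ConfigKey;

    public class GroupedRstSketch {
        // Emit one RST section per config group, keys ordered by orderInGroup.
        public static String toGroupedRst(ConfigDef def) {
            Map<String, List<ConfigKey>> byGroup = new LinkedHashMap<>();
            for (ConfigKey key : def.configKeys().values()) {
                String group = key.group == null ? "Common" : key.group;
                byGroup.computeIfAbsent(group, g -> new ArrayList<>()).add(key);
            }
            StringBuilder rst = new StringBuilder();
            for (Map.Entry<String, List<ConfigKey>> e : byGroup.entrySet()) {
                rst.append(e.getKey()).append('\n');                        // section header,
                rst.append(underline(e.getKey().length())).append("\n\n");  // RST style
                e.getValue().sort(Comparator.comparingInt(k -> k.orderInGroup));
                for (ConfigKey k : e.getValue()) {
                    rst.append("``").append(k.name).append("``\n  ")
                       .append(k.documentation).append('\n');
                    if (k.dependents != null && !k.dependents.isEmpty())
                        rst.append("  Dependents: ").append(k.dependents).append('\n');
                }
            }
            return rst.toString();
        }

        private static String underline(int n) {
            char[] buf = new char[n];
            Arrays.fill(buf, '^');
            return new String(buf);
        }
    }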


[GitHub] kafka pull request #1696: KAFKA-4010; ConfigDef.toRst() to have grouped sect...

2016-08-02 Thread rekhajoshm
GitHub user rekhajoshm reopened a pull request:

https://github.com/apache/kafka/pull/1696

KAFKA-4010; ConfigDef.toRst() to have grouped sections with dependents info

- Added sort method with group order
- Added dependents info 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rekhajoshm/kafka KAFKA-4010

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1696.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1696


commit 630fd4c5220ca5c934492318037f8a493a305b01
Author: Joshi 
Date:   2016-08-03T04:08:41Z

KAFKA-4010; ConfigDef.toEnrichedRst() to have grouped sections with 
dependents info




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4010) ConfigDef.toRst() should create sections for each group

2016-08-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15405266#comment-15405266
 ] 

ASF GitHub Bot commented on KAFKA-4010:
---

Github user rekhajoshm closed the pull request at:

https://github.com/apache/kafka/pull/1696


> ConfigDef.toRst() should create sections for each group
> ---
>
> Key: KAFKA-4010
> URL: https://issues.apache.org/jira/browse/KAFKA-4010
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Shikhar Bhushan
>Assignee: Rekha Joshi
>Priority: Minor
>
> Currently the ordering seems a bit arbitrary. There is a logical grouping 
> that connectors are now able to specify with the 'group' field, which we 
> should use as section headers. Also it would be good to generate {{:ref:}} 
> for each section.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #1696: KAFKA-4010; ConfigDef.toRst() to have grouped sect...

2016-08-02 Thread rekhajoshm
Github user rekhajoshm closed the pull request at:

https://github.com/apache/kafka/pull/1696


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3999) Consumer bytes-fetched metric uses decompressed message size

2016-08-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15405222#comment-15405222
 ] 

ASF GitHub Bot commented on KAFKA-3999:
---

Github user vahidhashemian closed the pull request at:

https://github.com/apache/kafka/pull/1698


> Consumer bytes-fetched metric uses decompressed message size
> 
>
> Key: KAFKA-3999
> URL: https://issues.apache.org/jira/browse/KAFKA-3999
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.9.0.1, 0.10.0.0
>Reporter: Jason Gustafson
>Assignee: Vahid Hashemian
>Priority: Minor
>
> It looks like the computation for the bytes-fetched metrics uses the size of 
> the decompressed message set. I would have expected it to be based off of the 
> raw size of the fetch responses. Perhaps it would be helpful to expose both 
> the raw and decompressed fetch sizes? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
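
For readers unfamiliar with the metrics machinery involved, a minimal,
self-contained sketch of recording such a sensor with
org.apache.kafka.common.metrics; the sensor and metric names below are
invented for illustration and are not the ones the PR adds:

    import org.apache.kafka.common.metrics.Metrics;
    import org.apache.kafka.common.metrics.Sensor;
    import org.apache.kafka.common.metrics.stats.Avg;
    import org.apache.kafka.common.metrics.stats.Max;

    public class RawFetchSizeSketch {
        public static void main(String[] args) {
            Metrics metrics = new Metrics();
            // One sensor with two stats: average and max raw fetch-response size.
            Sensor rawFetchSize = metrics.sensor("raw-fetch-size");
            rawFetchSize.add(metrics.metricName("raw-fetch-size-avg",
                    "consumer-fetch-manager-metrics"), new Avg());
            rawFetchSize.add(metrics.metricName("raw-fetch-size-max",
                    "consumer-fetch-manager-metrics"), new Max());
            // Record the on-the-wire (still compressed) size of each response.
            rawFetchSize.record(4096);
            rawFetchSize.record(16384);
            metrics.close();
        }
    }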


[jira] [Commented] (KAFKA-3999) Consumer bytes-fetched metric uses decompressed message size

2016-08-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15405223#comment-15405223
 ] 

ASF GitHub Bot commented on KAFKA-3999:
---

GitHub user vahidhashemian reopened a pull request:

https://github.com/apache/kafka/pull/1698

KAFKA-3999: Record raw size of fetch responses

Currently, only the decompressed size of fetch responses is recorded. This 
PR adds a sensor to keep track of the raw size as well.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vahidhashemian/kafka KAFKA-3999

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1698.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1698


commit ae9b0abf60e20923761eda7d5f351625a2b2c0dc
Author: Vahid Hashemian 
Date:   2016-08-02T23:07:27Z

KAFKA-3999: Record raw size of fetch responses

Currently, only the decompressed size of fetch responses is recorded. This 
PR adds a sensor to keep track of their raw size as well.




> Consumer bytes-fetched metric uses decompressed message size
> 
>
> Key: KAFKA-3999
> URL: https://issues.apache.org/jira/browse/KAFKA-3999
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.9.0.1, 0.10.0.0
>Reporter: Jason Gustafson
>Assignee: Vahid Hashemian
>Priority: Minor
>
> It looks like the computation for the bytes-fetched metrics uses the size of 
> the decompressed message set. I would have expected it to be based off of the 
> raw size of the fetch responses. Perhaps it would be helpful to expose both 
> the raw and decompressed fetch sizes? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #1698: KAFKA-3999: Record raw size of fetch responses

2016-08-02 Thread vahidhashemian
Github user vahidhashemian closed the pull request at:

https://github.com/apache/kafka/pull/1698


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #1698: KAFKA-3999: Record raw size of fetch responses

2016-08-02 Thread vahidhashemian
GitHub user vahidhashemian reopened a pull request:

https://github.com/apache/kafka/pull/1698

KAFKA-3999: Record raw size of fetch responses

Currently, only the decompressed size of fetch responses is recorded. This 
PR adds a sensor to keep track of the raw size as well.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vahidhashemian/kafka KAFKA-3999

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1698.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1698


commit ae9b0abf60e20923761eda7d5f351625a2b2c0dc
Author: Vahid Hashemian 
Date:   2016-08-02T23:07:27Z

KAFKA-3999: Record raw size of fetch responses

Currently, only the decompressed size of fetch responses is recorded. This 
PR adds a sensor to keep track of their raw size as well.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [DISCUSS] KIP-66 Kafka Connect Transformers for messages

2016-08-02 Thread Ewen Cheslack-Postava
On Thu, Jul 28, 2016 at 11:58 PM, Shikhar Bhushan wrote:

> >
> >
> > Hmm, operating on ConnectRecords probably doesn't work since you need to
> > emit the right type of record, which might mean instantiating a new one.
> I
> > think that means we either need 2 methods, one for SourceRecord, one for
> > SinkRecord, or we'd need to limit what parts of the message you can
> modify
> > (e.g. you can change the key/value via something like
> > transformKey(ConnectRecord) and transformValue(ConnectRecord), but other
> > fields would remain the same and the framework would handle allocating new
> > Source/SinkRecords if needed)
> >
>
> Good point, perhaps we could add an abstract method on ConnectRecord that
> takes all the shared fields as parameters and the implementations return a
> copy of the narrower SourceRecord/SinkRecord type as appropriate.
> Transformers would only operate on ConnectRecord rather than caring about
> SourceRecord or SinkRecord (in theory they could instanceof/cast, but the
> API should discourage it)
>
>
> > Is there a use case for hanging on to the original? I can't think of a
> > transformation where you'd need to do that (or couldn't just order things
> > differently so it isn't a problem).
>
>
> Yeah maybe this isn't really necessary. No strong preference here.
>
> > That said, I do worry a bit that farming too much stuff out to transformers
> > can result in "programming via config", i.e. a lot of the simplicity you
> > get from Connect disappears in long config files. Standardization would
> be
> > nice and might just avoid this (and doesn't cost that much implementing
> it
> > in each connector), and I'd personally prefer something a bit less
> flexible
> > but consistent and easy to configure.
>
>
> Not sure what you're suggesting :-) Standardized config properties for
> a small set of transformations, leaving it up to connectors to integrate?
>

I just mean that you get to the point where you're practically writing a
Kafka Streams application, you're just doing it through either an
incredibly convoluted set of transformers and configs, or a single
transformer with an incredibly convoluted set of configs. You basically get
to the point where your config is a mini DSL and you're not really saving
that much.

The real question is how much we want to venture into the "T" part of ETL.
I tend to favor minimizing how much we take on, since the rest of Connect
isn't designed for it; it's designed around the E & L parts.

-Ewen


> > Personally I'm skeptical of that level of flexibility in transformers --
> > it's getting awfully complex and certainly takes us pretty far from
> "config
> > only" realtime data integration. It's not clear to me what the use cases
> > are that aren't covered by a small set of common transformations that can
> > be chained together (e.g. rename/remove fields, mask values, and maybe a
> > couple more).
> >
>
> I agree that we should have some standard transformations that we ship with
> Connect that users would ideally lean towards for routine tasks. The ones
> you mention are some good candidates, which I'd imagine can expose simple
> config, e.g.
>transform.filter.whitelist=x,y,z # filter to a whitelist of fields
>transform.rename.spec=oldName1=>newName1, oldName2=>newName2
>topic.rename.replace=-/_
>topic.rename.prefix=kafka_
> etc..
>
> However the ecosystem will invariably have more complex transformers if we
> make this pluggable. And because ETL is messy, that's probably a good thing
> if folks are able to do their data munging orthogonally to connectors, so
> that connectors can focus on the logic of how data should be copied from/to
> datastores and Kafka.
>
>
> > In any case, we'd probably also have to change configs of connectors if
> we
> > allowed configs like that since presumably transformer configs will just
> be
> > part of the connector config.
> >
>
> Yeah, haven't thought much about how all the configuration would tie
> together...
>
> I think we'd need the ability to:
> - spec transformer chain (fully-qualified class names? perhaps special
> aliases for built-in ones? perhaps third-party fqcns can be assigned
> aliases by users in the chain spec, for easier configuration and to
> uniquely identify a transformation when it occurs more than one time in a
> chain?)
> - configure each transformer -- all properties prefixed with that
> transformer's ID (fqcn / alias) are routed to it
>
> Additionally, I think we would probably want to allow for topic-specific
> overrides (e.g. you want certain transformations for one topic, but
> different ones for another...)
>



-- 
Thanks,
Ewen
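
To make the API shape under discussion concrete, here is a self-contained
mock (all names invented, not from the KIP): transformations see only the
common record abstraction, and a copy method on the record hands back the
right concrete type, so no instanceof/cast is needed:

    import java.util.Locale;

    public class TransformerSketch {
        // The common abstraction transformations operate on.
        interface Record {
            Object key();
            Object value();
            Record copyWith(Object key, Object value); // the proposed abstract copy method
        }

        // A transformation never needs to know the concrete record type.
        interface Transformation {
            Record apply(Record record);
        }

        static final class SourceRecord implements Record {
            final Object key, value;
            SourceRecord(Object key, Object value) { this.key = key; this.value = value; }
            public Object key() { return key; }
            public Object value() { return value; }
            public Record copyWith(Object k, Object v) { return new SourceRecord(k, v); }
            public String toString() { return "SourceRecord(" + key + ", " + value + ")"; }
        }

        public static void main(String[] args) {
            Transformation upperCaseValue =
                r -> r.copyWith(r.key(), ((String) r.value()).toUpperCase(Locale.ROOT));
            // Still a SourceRecord on the way out; the type is never lost.
            System.out.println(upperCaseValue.apply(new SourceRecord("k", "hello")));
        }
    }

The framework would chain such transformations between the connector and
Kafka, in the order given by the config.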


Jenkins build is back to normal : kafka-trunk-jdk8 #795

2016-08-02 Thread Apache Jenkins Server
See 



Re: JDK configuration for Kafka jobs in Jenkins

2016-08-02 Thread Guozhang Wang
Thanks for the update Ismael.

Guozhang

On Mon, Aug 1, 2016 at 4:48 AM, Ismael Juma wrote:

> Hi all,
>
> Just a quick update with regard to the JDK configuration for Kafka jobs in
> Jenkins. The Infra team has made some changes on how the JDK is installed
> in Jenkins slaves and how it should be configured in Jenkins jobs. See the
> following for details:
>
>
> https://mail-archives.apache.org/mod_mbox/www-builds/201608.mbox/%3CCAN0Gg1eNFn9FP_mdyQBB_9gWHg87B9sjwQ82JbWtkGob42%2B5%2Bw%40mail.gmail.com%3E
>
> I have updated the Kafka Jenkins jobs to use the new configuration options.
> JDK 7 jobs now use "JDK 1.7 (latest)" (jdk1.7.0_80) and JDK 8 jobs now use
> "JDK 1.8 (latest)" (jdk1.8.0_102). Updates within the same major JDK version
> will be automatic.
>
> Ismael
>



-- 
-- Guozhang


[jira] [Work started] (KAFKA-3989) Add JMH module for Benchmarks

2016-08-02 Thread Bill Bejeck (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on KAFKA-3989 started by Bill Bejeck.
--
> Add JMH module for Benchmarks
> -
>
> Key: KAFKA-3989
> URL: https://issues.apache.org/jira/browse/KAFKA-3989
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Bill Bejeck
>Assignee: Bill Bejeck
>
> JMH is a Java harness for building, running, and analyzing benchmarks written 
> in Java or other JVM languages. To run properly, JMH needs to be in its own 
> module. This task will also investigate using the jmh-gradle-plugin 
> [https://github.com/melix/jmh-gradle-plugin], which enables the use of JMH 
> from Gradle. This is related to 
> [https://issues.apache.org/jira/browse/KAFKA-3973]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
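
For readers new to JMH, a minimal benchmark of the shape such a module would
hold (the benchmark subject is arbitrary and only meant to show the
annotations involved):

    import java.util.concurrent.TimeUnit;
    import org.openjdk.jmh.annotations.Benchmark;
    import org.openjdk.jmh.annotations.BenchmarkMode;
    import org.openjdk.jmh.annotations.Mode;
    import org.openjdk.jmh.annotations.OutputTimeUnit;
    import org.openjdk.jmh.annotations.Scope;
    import org.openjdk.jmh.annotations.State;

    @State(Scope.Thread)
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    public class StringConcatBenchmark {
        private String a = "hello", b = "world";

        @Benchmark
        public String concat() {
            // Returning the result lets JMH consume it, defeating dead-code elimination.
            return a + b;
        }
    }

Run through the Gradle plugin (or the generated benchmarks jar), JMH forks a
fresh JVM, warms up, and reports averaged timings, which is why it wants its
own module rather than living inside the unit tests.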


[jira] [Updated] (KAFKA-3999) Consumer bytes-fetched metric uses decompressed message size

2016-08-02 Thread Vahid Hashemian (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vahid Hashemian updated KAFKA-3999:
---
Status: Patch Available  (was: Open)

> Consumer bytes-fetched metric uses decompressed message size
> 
>
> Key: KAFKA-3999
> URL: https://issues.apache.org/jira/browse/KAFKA-3999
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.0.0, 0.9.0.1
>Reporter: Jason Gustafson
>Assignee: Vahid Hashemian
>Priority: Minor
>
> It looks like the computation for the bytes-fetched metrics uses the size of 
> the decompressed message set. I would have expected it to be based off of the 
> raw size of the fetch responses. Perhaps it would be helpful to expose both 
> the raw and decompressed fetch sizes? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #1698: KAFKA-3999: Record raw size of fetch responses

2016-08-02 Thread vahidhashemian
GitHub user vahidhashemian opened a pull request:

https://github.com/apache/kafka/pull/1698

KAFKA-3999: Record raw size of fetch responses

Currently, only the decompressed size of fetch responses is recorded. This 
PR adds a sensor to keep track of the raw size as well.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vahidhashemian/kafka KAFKA-3999

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1698.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1698


commit ae9b0abf60e20923761eda7d5f351625a2b2c0dc
Author: Vahid Hashemian 
Date:   2016-08-02T23:07:27Z

KAFKA-3999: Record raw size of fetch responses

Currently, only the decompressed size of fetch responses is recorded. This 
PR adds a sensor to keep track of their raw size as well.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3999) Consumer bytes-fetched metric uses decompressed message size

2016-08-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15404984#comment-15404984
 ] 

ASF GitHub Bot commented on KAFKA-3999:
---

GitHub user vahidhashemian opened a pull request:

https://github.com/apache/kafka/pull/1698

KAFKA-3999: Record raw size of fetch responses

Currently, only the decompressed size of fetch responses is recorded. This 
PR adds a sensor to keep track of the raw size as well.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vahidhashemian/kafka KAFKA-3999

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1698.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1698


commit ae9b0abf60e20923761eda7d5f351625a2b2c0dc
Author: Vahid Hashemian 
Date:   2016-08-02T23:07:27Z

KAFKA-3999: Record raw size of fetch responses

Currently, only the decompressed size of fetch responses is recorded. This 
PR adds a sensor to keep track of their raw size as well.




> Consumer bytes-fetched metric uses decompressed message size
> 
>
> Key: KAFKA-3999
> URL: https://issues.apache.org/jira/browse/KAFKA-3999
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.9.0.1, 0.10.0.0
>Reporter: Jason Gustafson
>Assignee: Vahid Hashemian
>Priority: Minor
>
> It looks like the computation for the bytes-fetched metrics uses the size of 
> the decompressed message set. I would have expected it to be based off of the 
> raw size of the fetch responses. Perhaps it would be helpful to expose both 
> the raw and decompressed fetch sizes? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-3999) Consumer bytes-fetched metric uses decompressed message size

2016-08-02 Thread Vahid Hashemian (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vahid Hashemian reassigned KAFKA-3999:
--

Assignee: Vahid Hashemian

> Consumer bytes-fetched metric uses decompressed message size
> 
>
> Key: KAFKA-3999
> URL: https://issues.apache.org/jira/browse/KAFKA-3999
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.9.0.1, 0.10.0.0
>Reporter: Jason Gustafson
>Assignee: Vahid Hashemian
>Priority: Minor
>
> It looks like the computation for the bytes-fetched metrics uses the size of 
> the decompressed message set. I would have expected it to be based off of the 
> raw size of the fetch responses. Perhaps it would be helpful to expose both 
> the raw and decompressed fetch sizes? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-4000) Consumer per-topic metrics do not aggregate partitions from the same topic

2016-08-02 Thread Vahid Hashemian (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vahid Hashemian updated KAFKA-4000:
---
Status: Patch Available  (was: In Progress)

> Consumer per-topic metrics do not aggregate partitions from the same topic
> --
>
> Key: KAFKA-4000
> URL: https://issues.apache.org/jira/browse/KAFKA-4000
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.0.0, 0.9.0.1
>Reporter: Jason Gustafson
>Assignee: Vahid Hashemian
>Priority: Minor
>
> In the Consumer Fetcher code, we have per-topic fetch metrics, but they seem 
> to be computed from each partition separately. It seems like we should 
> aggregate them by topic.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
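
A toy illustration of the aggregation being asked for: fold per-partition
byte counts into per-topic totals before recording the metric
(TopicPartition is the real client class; everything else here is invented):

    import java.util.HashMap;
    import java.util.Map;
    import org.apache.kafka.common.TopicPartition;

    public class PerTopicAggregation {
        public static void main(String[] args) {
            Map<TopicPartition, Long> bytesByPartition = new HashMap<>();
            bytesByPartition.put(new TopicPartition("orders", 0), 1024L);
            bytesByPartition.put(new TopicPartition("orders", 1), 2048L);
            bytesByPartition.put(new TopicPartition("clicks", 0), 512L);

            // Aggregate by topic so the metric is recorded once per topic,
            // not once per partition.
            Map<String, Long> bytesByTopic = new HashMap<>();
            for (Map.Entry<TopicPartition, Long> e : bytesByPartition.entrySet())
                bytesByTopic.merge(e.getKey().topic(), e.getValue(), Long::sum);

            System.out.println(bytesByTopic); // {orders=3072, clicks=512}
        }
    }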


[jira] [Updated] (KAFKA-4014) Use Collections.singletonList instead of Arrays.asList for single arguments

2016-08-02 Thread Rekha Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rekha Joshi updated KAFKA-4014:
---
Summary: Use Collections.singletonList instead of Arrays.asList for single 
arguments  (was: Use Collections.singletonList instead of Arrays.asList for 
unmodified single arguments)

> Use Collections.singletonList instead of Arrays.asList for single arguments
> ---
>
> Key: KAFKA-4014
> URL: https://issues.apache.org/jira/browse/KAFKA-4014
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 0.9.0.1
>Reporter: Rekha Joshi
>Assignee: Rekha Joshi
>Priority: Minor
>
> Using Collections.singletonList instead of Arrays.asList for single 
> arguments is better



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
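
The change itself is mechanical. A tiny illustration: Collections.singletonList
returns a dedicated immutable one-element list, whereas Arrays.asList first
allocates a varargs array and then wraps it:

    import java.util.Arrays;
    import java.util.Collections;
    import java.util.List;

    public class SingletonListSketch {
        public static void main(String[] args) {
            // Before: backed by a freshly allocated one-element varargs array.
            List<String> before = Arrays.asList("my-topic");
            // After: a dedicated single-element list, slightly cheaper to create.
            List<String> after = Collections.singletonList("my-topic");
            System.out.println(before.equals(after)); // true -- same contents
        }
    }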


[jira] [Commented] (KAFKA-3042) updateIsr should stop after failed several times due to zkVersion issue

2016-08-02 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15404901#comment-15404901
 ] 

Jun Rao commented on KAFKA-3042:


[~onurkaraman], there are a couple of things.

1. Currently when a broker starts up, it expects the very first 
LeaderAndIsrRequest to contain all the partitions hosted on this broker. After 
that, we read the last checkpointed high watermark and start the high watermark 
checkpoint thread. If we combine UpdateMetadataRequest and LeaderAndIsrRequest, 
the very first request that a broker receives could be an UpdateMetadataRequest 
including partitions not hosted on this broker. Then, we may checkpoint high 
watermarks on incorrect partitions.

2. Currently, LeaderAndIsrRequest is used to inform replicas about the new 
leader and is only sent to brokers storing the partition. UpdateMetadataRequest 
is used for updating the metadata cache for the clients and is sent to every 
broker. Technically, they are for different things. So, using separate requests 
makes logical sense. We could use a single request to do both. Not sure if this 
makes it clearer or more confusing from a debugging perspective. In any case, 
there will be significant code changes to do this. I am not opposed to that. I 
just think that if we want to do that, we probably want to think through how to 
improve the controller logic holistically since there are other known pain 
points in the controller.


> updateIsr should stop after failed several times due to zkVersion issue
> ---
>
> Key: KAFKA-3042
> URL: https://issues.apache.org/jira/browse/KAFKA-3042
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1
> Environment: jdk 1.7
> centos 6.4
>Reporter: Jiahongchao
> Fix For: 0.10.1.0
>
> Attachments: controller.log, server.log.2016-03-23-01, 
> state-change.log
>
>
> Sometimes one broker may repeatedly log
> "Cached zkVersion 54 not equal to that in zookeeper, skip updating ISR"
> I think this is because the broker considers itself the leader when in fact 
> it's a follower.
> So after several failed tries, it needs to find out who the leader is.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4014) Use Collections.singletonList instead of Arrays.asList for single arguments

2016-08-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15404900#comment-15404900
 ] 

ASF GitHub Bot commented on KAFKA-4014:
---

GitHub user rekhajoshm opened a pull request:

https://github.com/apache/kafka/pull/1697

KAFKA-4014; Using Collections.singletonList instead of Arrays.asList for 
single arguments

Using Collections.singletonList instead of Arrays.asList for single 
arguments

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rekhajoshm/kafka KAFKA-4014

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1697.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1697


commit c9a66992b1095616f87c5748f210b973ebc7eb01
Author: Rekha Joshi 
Date:   2016-05-26T17:48:37Z

Merge pull request #2 from apache/trunk

Apache Kafka trunk pull

commit 8d7fb005cb132440e7768a5b74257d2598642e0f
Author: Rekha Joshi 
Date:   2016-05-30T21:37:43Z

Merge pull request #3 from apache/trunk

Apache Kafka trunk pull

commit fbef9a8fb1411282fbadec46955691c3e7ba2578
Author: Rekha Joshi 
Date:   2016-06-04T23:58:02Z

Merge pull request #4 from apache/trunk

Apache Kafka trunk pull

commit 172db701bf9affda1304b684921260d1cd36ae9e
Author: Rekha Joshi 
Date:   2016-06-06T22:10:31Z

Merge pull request #6 from apache/trunk

Apache Kafka trunk pull

commit 9d18d93745cf2bc9b0ab4bb9b25d9a31196ef918
Author: Rekha Joshi 
Date:   2016-06-07T19:36:45Z

Merge pull request #7 from apache/trunk

Apache trunk pull

commit 882faea01f28aef1977f4ced6567833bcf736840
Author: Rekha Joshi 
Date:   2016-06-13T20:01:43Z

Merge pull request #8 from confluentinc/trunk

Apache kafka trunk pull

commit 851315d39c0c308d79b9575546822aa932c46a09
Author: Rekha Joshi 
Date:   2016-06-27T17:34:54Z

Merge pull request #9 from apache/trunk

Merge Apache kafka trunk

commit 613f07c2b4193302c82a5d6eaa1e53e4b87bfbc1
Author: Rekha Joshi 
Date:   2016-07-09T17:03:45Z

Merge pull request #11 from apache/trunk

Merge Apache kafka trunk

commit 150e46e462cc192fb869e633f6d9ab681e7b83f9
Author: Rekha Joshi 
Date:   2016-08-02T19:44:09Z

Merge pull request #12 from apache/trunk

Apache Kafka trunk pull

commit df223ee5a8c65d5e699b7c51f45df794c7acde9e
Author: Joshi 
Date:   2016-08-02T22:10:18Z

KAFKA-4014; Using Collections.singletonList instead of Arrays.asList for 
single arguments




> Use Collections.singletonList instead of Arrays.asList for single arguments
> ---
>
> Key: KAFKA-4014
> URL: https://issues.apache.org/jira/browse/KAFKA-4014
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 0.9.0.1
>Reporter: Rekha Joshi
>Assignee: Rekha Joshi
>Priority: Minor
>
> Using Collections.singletonList instead of Arrays.asList for single 
> arguments is better for performance



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #1697: KAFKA-4014; Using Collections.singletonList instea...

2016-08-02 Thread rekhajoshm
GitHub user rekhajoshm opened a pull request:

https://github.com/apache/kafka/pull/1697

KAFKA-4014; Using Collections.singletonList instead of Arrays.asList for 
single arguments

Using Collections.singletonList instead of Arrays.asList for single 
arguments

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rekhajoshm/kafka KAFKA-4014

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1697.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1697


commit c9a66992b1095616f87c5748f210b973ebc7eb01
Author: Rekha Joshi 
Date:   2016-05-26T17:48:37Z

Merge pull request #2 from apache/trunk

Apache Kafka trunk pull

commit 8d7fb005cb132440e7768a5b74257d2598642e0f
Author: Rekha Joshi 
Date:   2016-05-30T21:37:43Z

Merge pull request #3 from apache/trunk

Apache Kafka trunk pull

commit fbef9a8fb1411282fbadec46955691c3e7ba2578
Author: Rekha Joshi 
Date:   2016-06-04T23:58:02Z

Merge pull request #4 from apache/trunk

Apache Kafka trunk pull

commit 172db701bf9affda1304b684921260d1cd36ae9e
Author: Rekha Joshi 
Date:   2016-06-06T22:10:31Z

Merge pull request #6 from apache/trunk

Apache Kafka trunk pull

commit 9d18d93745cf2bc9b0ab4bb9b25d9a31196ef918
Author: Rekha Joshi 
Date:   2016-06-07T19:36:45Z

Merge pull request #7 from apache/trunk

Apache trunk pull

commit 882faea01f28aef1977f4ced6567833bcf736840
Author: Rekha Joshi 
Date:   2016-06-13T20:01:43Z

Merge pull request #8 from confluentinc/trunk

Apache kafka trunk pull

commit 851315d39c0c308d79b9575546822aa932c46a09
Author: Rekha Joshi 
Date:   2016-06-27T17:34:54Z

Merge pull request #9 from apache/trunk

Merge Apache kafka trunk

commit 613f07c2b4193302c82a5d6eaa1e53e4b87bfbc1
Author: Rekha Joshi 
Date:   2016-07-09T17:03:45Z

Merge pull request #11 from apache/trunk

Merge Apache kafka trunk

commit 150e46e462cc192fb869e633f6d9ab681e7b83f9
Author: Rekha Joshi 
Date:   2016-08-02T19:44:09Z

Merge pull request #12 from apache/trunk

Apache Kafka trunk pull

commit df223ee5a8c65d5e699b7c51f45df794c7acde9e
Author: Joshi 
Date:   2016-08-02T22:10:18Z

KAFKA-4014; Using Collections.singletonList instead of Arrays.asList for 
single arguments




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-4014) Use Collections.singletonList instead of Arrays.asList for single arguments

2016-08-02 Thread Rekha Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rekha Joshi updated KAFKA-4014:
---
Description: Usage of Collections.singletonList instead of Arrays.asList 
for single arguments better for performance  (was: Usage of 
Collections.singletonList instead of Arrays.asList for single arguments better)

> Use Collections.singletonList instead of Arrays.asList for single arguments
> ---
>
> Key: KAFKA-4014
> URL: https://issues.apache.org/jira/browse/KAFKA-4014
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 0.9.0.1
>Reporter: Rekha Joshi
>Assignee: Rekha Joshi
>Priority: Minor
>
> Using Collections.singletonList instead of Arrays.asList for single 
> arguments is better for performance



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-4014) Use Collections.singletonList instead of Arrays.asList for unmodified single arguments

2016-08-02 Thread Rekha Joshi (JIRA)
Rekha Joshi created KAFKA-4014:
--

 Summary: Use Collections.singletonList instead of Arrays.asList 
for unmodified single arguments
 Key: KAFKA-4014
 URL: https://issues.apache.org/jira/browse/KAFKA-4014
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 0.9.0.1
Reporter: Rekha Joshi
Priority: Minor


Using Collections.singletonList instead of Arrays.asList for single 
arguments is better



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-4014) Use Collections.singletonList instead of Arrays.asList for unmodified single arguments

2016-08-02 Thread Rekha Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rekha Joshi updated KAFKA-4014:
---
Issue Type: Improvement  (was: Bug)

> Use Collections.singletonList instead of Arrays.asList for unmodified single 
> arguments
> --
>
> Key: KAFKA-4014
> URL: https://issues.apache.org/jira/browse/KAFKA-4014
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 0.9.0.1
>Reporter: Rekha Joshi
>Priority: Minor
>
> Using Collections.singletonList instead of Arrays.asList for single 
> arguments is better



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-4014) Use Collections.singletonList instead of Arrays.asList for unmodified single arguments

2016-08-02 Thread Rekha Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rekha Joshi reassigned KAFKA-4014:
--

Assignee: Rekha Joshi

> Use Collections.singletonList instead of Arrays.asList for unmodified single 
> arguments
> --
>
> Key: KAFKA-4014
> URL: https://issues.apache.org/jira/browse/KAFKA-4014
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 0.9.0.1
>Reporter: Rekha Joshi
>Assignee: Rekha Joshi
>Priority: Minor
>
> Using Collections.singletonList instead of Arrays.asList for single 
> arguments is better



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #1674: HOTFIX: Fixes to javadoc and to state store name f...

2016-08-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1674


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #1690: MINOR: Use `close()` instead of `dispose()` in var...

2016-08-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1690


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #1688: MINOR: Remove unnecessary synchronized block in or...

2016-08-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1688


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3929) Add prefix for underlying clients configs in StreamConfig

2016-08-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15404814#comment-15404814
 ] 

ASF GitHub Bot commented on KAFKA-3929:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1649


> Add prefix for underlying clients configs in StreamConfig
> -
>
> Key: KAFKA-3929
> URL: https://issues.apache.org/jira/browse/KAFKA-3929
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: Damian Guy
>  Labels: api
> Fix For: 0.10.1.0
>
>
> There are a couple of configs that have the same name for producer / consumer 
> configs (e.g. take a look at {{CommonClientConfigs}}), and there are commonly 
> named configs for producer / consumer interceptors as well.
> This is semi-related to KAFKA-3740 since we need to add "sub-class" configs 
> for RocksDB as well, and we'd better have some prefix mechanism for such 
> hierarchical configs in general.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
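
A small standalone sketch of the prefixing mechanism under discussion,
assuming a "consumer."/"producer." naming convention (the real constants and
helpers live in StreamsConfig; this only shows the idea, with invented names):

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Properties;

    public class PrefixedConfigs {
        // Return the entries carrying the given prefix, with the prefix stripped,
        // so "consumer.max.poll.records" becomes "max.poll.records" for the consumer.
        static Map<String, Object> withPrefix(Properties props, String prefix) {
            Map<String, Object> out = new HashMap<>();
            for (String name : props.stringPropertyNames())
                if (name.startsWith(prefix))
                    out.put(name.substring(prefix.length()), props.getProperty(name));
            return out;
        }

        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("consumer.max.poll.records", "500");
            props.put("producer.linger.ms", "100");
            System.out.println(withPrefix(props, "consumer.")); // {max.poll.records=500}
            System.out.println(withPrefix(props, "producer.")); // {linger.ms=100}
        }
    }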


[jira] [Commented] (KAFKA-4010) ConfigDef.toRst() should create sections for each group

2016-08-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15404813#comment-15404813
 ] 

ASF GitHub Bot commented on KAFKA-4010:
---

GitHub user rekhajoshm opened a pull request:

https://github.com/apache/kafka/pull/1696

KAFKA-4010; ConfigDef.toRst() to have grouped sections with dependents info

- Added sort method with group order
- Added dependents info 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rekhajoshm/kafka KAFKA-4010

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1696.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1696


commit c9a66992b1095616f87c5748f210b973ebc7eb01
Author: Rekha Joshi 
Date:   2016-05-26T17:48:37Z

Merge pull request #2 from apache/trunk

Apache Kafka trunk pull

commit 8d7fb005cb132440e7768a5b74257d2598642e0f
Author: Rekha Joshi 
Date:   2016-05-30T21:37:43Z

Merge pull request #3 from apache/trunk

Apache Kafka trunk pull

commit fbef9a8fb1411282fbadec46955691c3e7ba2578
Author: Rekha Joshi 
Date:   2016-06-04T23:58:02Z

Merge pull request #4 from apache/trunk

Apache Kafka trunk pull

commit 172db701bf9affda1304b684921260d1cd36ae9e
Author: Rekha Joshi 
Date:   2016-06-06T22:10:31Z

Merge pull request #6 from apache/trunk

Apache Kafka trunk pull

commit 9d18d93745cf2bc9b0ab4bb9b25d9a31196ef918
Author: Rekha Joshi 
Date:   2016-06-07T19:36:45Z

Merge pull request #7 from apache/trunk

Apache trunk pull

commit 882faea01f28aef1977f4ced6567833bcf736840
Author: Rekha Joshi 
Date:   2016-06-13T20:01:43Z

Merge pull request #8 from confluentinc/trunk

Apache kafka trunk pull

commit 851315d39c0c308d79b9575546822aa932c46a09
Author: Rekha Joshi 
Date:   2016-06-27T17:34:54Z

Merge pull request #9 from apache/trunk

Merge Apache kafka trunk

commit 613f07c2b4193302c82a5d6eaa1e53e4b87bfbc1
Author: Rekha Joshi 
Date:   2016-07-09T17:03:45Z

Merge pull request #11 from apache/trunk

Merge Apache kafka trunk

commit 150e46e462cc192fb869e633f6d9ab681e7b83f9
Author: Rekha Joshi 
Date:   2016-08-02T19:44:09Z

Merge pull request #12 from apache/trunk

Apache Kafka trunk pull

commit 00e70da51f03b23877e4e23e7e1500a1c9d7d20a
Author: Joshi 
Date:   2016-08-02T21:17:17Z

KAFKA-4010: ConfigDef.toRst() to have grouped sections with dependents info




> ConfigDef.toRst() should create sections for each group
> ---
>
> Key: KAFKA-4010
> URL: https://issues.apache.org/jira/browse/KAFKA-4010
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Shikhar Bhushan
>Assignee: Rekha Joshi
>Priority: Minor
>
> Currently the ordering seems a bit arbitrary. There is a logical grouping 
> that connectors are now able to specify with the 'group' field, which we 
> should use as section headers. Also it would be good to generate {{:ref:}} 
> for each section.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #1649: KAFKA-3929: Add prefix for underlying clients conf...

2016-08-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1649


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-2630) Add Namespaces to Kafka

2016-08-02 Thread Vahid Hashemian (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15404811#comment-15404811
 ] 

Vahid Hashemian commented on KAFKA-2630:


Any update [~singhashish]?

> Add Namespaces to Kafka
> ---
>
> Key: KAFKA-2630
> URL: https://issues.apache.org/jira/browse/KAFKA-2630
> Project: Kafka
>  Issue Type: New Feature
>Reporter: Ashish K Singh
>Assignee: Ashish K Singh
>
> Apache Kafka is rapidly finding its place in data-heavy organizations as a 
> fault-tolerant message bus. One of the goals of Kafka is data integration, 
> which makes it important to support many users in one Kafka system. With 
> increasing adoption and a growing user community, support for multi-tenancy is 
> becoming a popular demand. There have been a few discussions on Apache Kafka’s 
> mailing lists regarding the same, indicating the importance of the feature. 
> Namespaces will allow/enable many functionalities that require logical grouping 
> of topics. If you think of a topic as a SQL table, then a namespace is a SQL 
> database that lets you group tables together.
> [KIP-37|https://cwiki.apache.org/confluence/display/KAFKA/KIP-37+-+Add+Namespaces+to+Kafka]
>  covers the details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3929) Add prefix for underlying clients configs in StreamConfig

2016-08-02 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-3929:
-
   Resolution: Fixed
Fix Version/s: 0.10.1.0
   Status: Resolved  (was: Patch Available)

Issue resolved by pull request 1649
[https://github.com/apache/kafka/pull/1649]

> Add prefix for underlying clients configs in StreamConfig
> -
>
> Key: KAFKA-3929
> URL: https://issues.apache.org/jira/browse/KAFKA-3929
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: Damian Guy
>  Labels: api
> Fix For: 0.10.1.0
>
>
> There are a couple of configs that have the same name for producer / consumer 
> configs (e.g. take a look at {{CommonClientConfigs}}), and there are commonly 
> named configs for producer / consumer interceptors as well.
> This is semi-related to KAFKA-3740 since we need to add "sub-class" configs 
> for RocksDB as well, and we'd better have some prefix mechanism for such 
> hierarchical configs in general.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #1696: KAFKA-4010; ConfigDef.toRst() to have grouped sect...

2016-08-02 Thread rekhajoshm
GitHub user rekhajoshm opened a pull request:

https://github.com/apache/kafka/pull/1696

KAFKA-4010; ConfigDef.toRst() to have grouped sections with dependents info

- Added sort method with group order
- Added dependents info 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rekhajoshm/kafka KAFKA-4010

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1696.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1696


commit c9a66992b1095616f87c5748f210b973ebc7eb01
Author: Rekha Joshi 
Date:   2016-05-26T17:48:37Z

Merge pull request #2 from apache/trunk

Apache Kafka trunk pull

commit 8d7fb005cb132440e7768a5b74257d2598642e0f
Author: Rekha Joshi 
Date:   2016-05-30T21:37:43Z

Merge pull request #3 from apache/trunk

Apache Kafka trunk pull

commit fbef9a8fb1411282fbadec46955691c3e7ba2578
Author: Rekha Joshi 
Date:   2016-06-04T23:58:02Z

Merge pull request #4 from apache/trunk

Apache Kafka trunk pull

commit 172db701bf9affda1304b684921260d1cd36ae9e
Author: Rekha Joshi 
Date:   2016-06-06T22:10:31Z

Merge pull request #6 from apache/trunk

Apache Kafka trunk pull

commit 9d18d93745cf2bc9b0ab4bb9b25d9a31196ef918
Author: Rekha Joshi 
Date:   2016-06-07T19:36:45Z

Merge pull request #7 from apache/trunk

Apache trunk pull

commit 882faea01f28aef1977f4ced6567833bcf736840
Author: Rekha Joshi 
Date:   2016-06-13T20:01:43Z

Merge pull request #8 from confluentinc/trunk

Apache kafka trunk pull

commit 851315d39c0c308d79b9575546822aa932c46a09
Author: Rekha Joshi 
Date:   2016-06-27T17:34:54Z

Merge pull request #9 from apache/trunk

Merge Apache kafka trunk

commit 613f07c2b4193302c82a5d6eaa1e53e4b87bfbc1
Author: Rekha Joshi 
Date:   2016-07-09T17:03:45Z

Merge pull request #11 from apache/trunk

Merge Apache kafka trunk

commit 150e46e462cc192fb869e633f6d9ab681e7b83f9
Author: Rekha Joshi 
Date:   2016-08-02T19:44:09Z

Merge pull request #12 from apache/trunk

Apache Kafka trunk pull

commit 00e70da51f03b23877e4e23e7e1500a1c9d7d20a
Author: Joshi 
Date:   2016-08-02T21:17:17Z

KAFKA-4010: ConfigDef.toRst() to have grouped sections with dependents info




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4013) SaslServerCallbackHandler should include cause for exception

2016-08-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15404754#comment-15404754
 ] 

ASF GitHub Bot commented on KAFKA-4013:
---

GitHub user bbaugher opened a pull request:

https://github.com/apache/kafka/pull/1695

KAFKA-4013: Included exception cause in SaslServerCallbackHandler



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bbaugher/kafka KAFKA-4013

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1695.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1695


commit 0e72dcb6d2a20095f03e659c600aabeab1c52c88
Author: Bryan Baugher 
Date:   2016-08-02T20:43:04Z

KAFKA-4013: Included exception cause in SaslServerCallbackHandler




> SaslServerCallbackHandler should include cause for exception
> 
>
> Key: KAFKA-4013
> URL: https://issues.apache.org/jira/browse/KAFKA-4013
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Bryan Baugher
>
> SaslServerCallbackHandler can throw an exception setting the authorized id 
> for a user[1], but it does not include the cause, which makes it hard to debug
> [1] - 
> https://github.com/apache/kafka/blob/0.10.0.0/clients/src/main/java/org/apache/kafka/common/security/authenticator/SaslServerCallbackHandler.java#L92



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
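
The fix is the standard cause-chaining pattern. A minimal sketch, with the
handler context simplified away (KafkaException does provide a
(String, Throwable) constructor):

    import org.apache.kafka.common.KafkaException;

    public class CauseChainingSketch {
        static String shortName(String principal) throws Exception {
            throw new Exception("No rules apply to " + principal); // stand-in failure
        }

        public static void main(String[] args) {
            String principal = "kafka/broker1@EXAMPLE.COM";
            try {
                shortName(principal);
            } catch (Exception e) {
                // Before: new KafkaException("Failed to set name for " + principal)
                // loses the underlying stack trace. Passing e keeps the cause:
                throw new KafkaException("Failed to set name for " + principal, e);
            }
        }
    }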


[GitHub] kafka pull request #1695: KAFKA-4013: Included exception cause in SaslServer...

2016-08-02 Thread bbaugher
GitHub user bbaugher opened a pull request:

https://github.com/apache/kafka/pull/1695

KAFKA-4013: Included exception cause in SaslServerCallbackHandler



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bbaugher/kafka KAFKA-4013

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1695.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1695


commit 0e72dcb6d2a20095f03e659c600aabeab1c52c88
Author: Bryan Baugher 
Date:   2016-08-02T20:43:04Z

KAFKA-4013: Included exception cause in SaslServerCallbackHandler




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (KAFKA-4013) SaslServerCallbackHandler should include cause for exception

2016-08-02 Thread Bryan Baugher (JIRA)
Bryan Baugher created KAFKA-4013:


 Summary: SaslServerCallbackHandler should include cause for 
exception
 Key: KAFKA-4013
 URL: https://issues.apache.org/jira/browse/KAFKA-4013
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Reporter: Bryan Baugher


SaslServerCallbackHandler can throw an exception setting the authorized id for 
a user[1], but it does not include the cause, which makes it hard to debug

[1] - 
https://github.com/apache/kafka/blob/0.10.0.0/clients/src/main/java/org/apache/kafka/common/security/authenticator/SaslServerCallbackHandler.java#L92



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4012) KerberosShortNamer should implement toString()

2016-08-02 Thread Bryan Baugher (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15404741#comment-15404741
 ] 

Bryan Baugher commented on KAFKA-4012:
--

Pull request, https://github.com/apache/kafka/pull/1694

> KerberosShortNamer should implement toString()
> --
>
> Key: KAFKA-4012
> URL: https://issues.apache.org/jira/browse/KAFKA-4012
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Bryan Baugher
>
> KerberosShortNamer will throw an exception in which it uses toString()[1], but 
> toString() is not implemented, so the message doesn't provide much value
> [1] - 
> https://github.com/apache/kafka/blob/0.10.0.0/clients/src/main/java/org/apache/kafka/common/security/kerberos/KerberosShortNamer.java#L98



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
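
A sketch of the kind of toString() that would help, assuming the interesting
state is the list of parsed principal-to-local rules (the class and field
names here are hypothetical):

    import java.util.Arrays;
    import java.util.List;

    public class KerberosShortNamerSketch {
        private final List<String> principalToLocalRules; // hypothetical field name

        KerberosShortNamerSketch(List<String> rules) {
            this.principalToLocalRules = rules;
        }

        @Override
        public String toString() {
            // Without this override, error messages show Object's default
            // "KerberosShortNamerSketch@1b6d3586", which says nothing useful.
            return "KerberosShortNamer(rules=" + principalToLocalRules + ")";
        }

        public static void main(String[] args) {
            System.out.println(new KerberosShortNamerSketch(
                    Arrays.asList("RULE:[1:$1@$0](.*@EXAMPLE.COM)s/@.*//", "DEFAULT")));
        }
    }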


[jira] [Commented] (KAFKA-4012) KerberosShortNamer should implement toString()

2016-08-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15404738#comment-15404738
 ] 

ASF GitHub Bot commented on KAFKA-4012:
---

GitHub user bbaugher opened a pull request:

https://github.com/apache/kafka/pull/1694

KAFKA-4012: Added #toString() to KerberosShortNamer



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bbaugher/kafka KAFKA-4012

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1694.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1694


commit 85219cb3115d9b596ee89820a84b80ee59d894d9
Author: Bryan Baugher 
Date:   2016-08-02T20:37:26Z

KAFKA-4012: Added #toString() to KerberosShortNamer




> KerberosShortNamer should implement toString()
> --
>
> Key: KAFKA-4012
> URL: https://issues.apache.org/jira/browse/KAFKA-4012
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Bryan Baugher
>
> KerberosShortNamer will throw an exception in which it uses toString()[1], but 
> toString() is not implemented, so the message doesn't provide much value
> [1] - 
> https://github.com/apache/kafka/blob/0.10.0.0/clients/src/main/java/org/apache/kafka/common/security/kerberos/KerberosShortNamer.java#L98



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #1694: KAFKA-4012: Added #toString() to KerberosShortName...

2016-08-02 Thread bbaugher
GitHub user bbaugher opened a pull request:

https://github.com/apache/kafka/pull/1694

KAFKA-4012: Added #toString() to KerberosShortNamer



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bbaugher/kafka KAFKA-4012

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1694.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1694


commit 85219cb3115d9b596ee89820a84b80ee59d894d9
Author: Bryan Baugher 
Date:   2016-08-02T20:37:26Z

KAFKA-4012: Added #toString() to KerberosShortNamer




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (KAFKA-4012) KerberosShortNamer should implement toString()

2016-08-02 Thread Bryan Baugher (JIRA)
Bryan Baugher created KAFKA-4012:


 Summary: KerberosShortNamer should implement toString()
 Key: KAFKA-4012
 URL: https://issues.apache.org/jira/browse/KAFKA-4012
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Reporter: Bryan Baugher


KerberosShortNamer will throw an exception in which it uses toString()[1], but 
toString() is not implemented, so the message doesn't provide much value

[1] - 
https://github.com/apache/kafka/blob/0.10.0.0/clients/src/main/java/org/apache/kafka/common/security/kerberos/KerberosShortNamer.java#L98



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-4010) ConfigDef.toRst() should create sections for each group

2016-08-02 Thread Rekha Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rekha Joshi reassigned KAFKA-4010:
--

Assignee: Rekha Joshi

> ConfigDef.toRst() should create sections for each group
> ---
>
> Key: KAFKA-4010
> URL: https://issues.apache.org/jira/browse/KAFKA-4010
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Shikhar Bhushan
>Assignee: Rekha Joshi
>Priority: Minor
>
> Currently the ordering seems a bit arbitrary. There is a logical grouping 
> that connectors are now able to specify with the 'group' field, which we 
> should use as section headers. Also it would be good to generate {{:ref:}} 
> for each section.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] KIP-54 Sticky Partition Assignment Strategy

2016-08-02 Thread Vahid S Hashemian
I would like to revive this thread and ask for additional feedback on this 
KIP.

There has already been some feedback, mostly in favor, plus some concern 
about the value gained considering the complexity and the semantics; i.e., 
eventually revoked assignments need to be processed in the 
onPartitionsAssigned() callback, and not in onPartitionsRevoked().
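To make those semantics concrete, here is a hedged sketch (illustrative only, not text from the KIP) of a consumer listener that defers cleanup of actually-lost partitions to onPartitionsAssigned():

{code:java}
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.common.TopicPartition;

// Hedged sketch of the semantics under discussion: with sticky assignment a
// partition listed in onPartitionsRevoked() may be handed straight back, so
// per-partition cleanup is deferred to onPartitionsAssigned(), where the old
// and new assignments can be diffed.
public class StickyAwareListener implements ConsumerRebalanceListener {
    private final Set<TopicPartition> owned = new HashSet<>();

    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> revoked) {
        // Intentionally no cleanup here: a revoked partition may be reassigned to us.
    }

    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> assigned) {
        Set<TopicPartition> lost = new HashSet<>(owned);
        lost.removeAll(assigned); // partitions we actually lost in this rebalance
        for (TopicPartition tp : lost)
            releaseStateFor(tp);
        owned.clear();
        owned.addAll(assigned);
    }

    private void releaseStateFor(TopicPartition tp) {
        System.out.println("releasing state for " + tp);
    }
}
{code}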

If it helps, I could also send a note to the users mailing list about this KIP 
and ask for their feedback.
I could also put the KIP up for a vote if that is expected at this point.

Thanks.
--Vahid




[jira] [Commented] (KAFKA-3042) updateIsr should stop after failed several times due to zkVersion issue

2016-08-02 Thread Onur Karaman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404545#comment-15404545
 ] 

Onur Karaman commented on KAFKA-3042:
-

I'm still trying to understand the downsides to getting rid of 
LeaderAndIsrRequest. Let's say we got rid of it. From a debugging standpoint, 
is there any information from the request log or mbeans that would no longer 
exist or be harder to figure out? If such a thing exists, maybe we can just 
augment the UpdateMetadataRequest or add an mbean to fill in that gap.

> updateIsr should stop after failed several times due to zkVersion issue
> ---
>
> Key: KAFKA-3042
> URL: https://issues.apache.org/jira/browse/KAFKA-3042
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1
> Environment: jdk 1.7
> centos 6.4
>Reporter: Jiahongchao
> Fix For: 0.10.1.0
>
> Attachments: controller.log, server.log.2016-03-23-01, 
> state-change.log
>
>
> sometimes one broker may repeatedly log
> "Cached zkVersion 54 not equal to that in zookeeper, skip updating ISR"
> I think this is because the broker considers itself the leader when in fact 
> it's a follower.
> So after several failed tries, it needs to find out who the leader is



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3752) Provide a way for KStreams to recover from unclean shutdown

2016-08-02 Thread Roger Hoover (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404439#comment-15404439
 ] 

Roger Hoover commented on KAFKA-3752:
-

[~guozhang] Your assessment seems correct.  It happened again when I 
restarted after a clean shutdown (SIGTERM + wait for exit). 

1.  We have a single KafkaStreams instance with 8 threads.
2. Here's the full log:  
https://gist.github.com/theduderog/f9ab4767cd3b098d404f5513a7e1c27e

> Provide a way for KStreams to recover from unclean shutdown
> ---
>
> Key: KAFKA-3752
> URL: https://issues.apache.org/jira/browse/KAFKA-3752
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Roger Hoover
>  Labels: architecture
>
> If a KStream application gets killed with SIGKILL (e.g. by the Linux OOM 
> Killer), it may leave behind lock files and fail to recover.
> It would be useful to have an option (say --force) to tell KStreams to 
> proceed even if it finds old LOCK files.
> {noformat}
> [2016-05-24 17:37:52,886] ERROR Failed to create an active task #0_0 in 
> thread [StreamThread-1]:  
> (org.apache.kafka.streams.processor.internals.StreamThread:583)
> org.apache.kafka.streams.errors.ProcessorStateException: Error while creating 
> the state manager
>   at 
> org.apache.kafka.streams.processor.internals.AbstractTask.(AbstractTask.java:71)
>   at 
> org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:86)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:550)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:577)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.access$000(StreamThread.java:68)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:123)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:222)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$1.onSuccess(AbstractCoordinator.java:232)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$1.onSuccess(AbstractCoordinator.java:227)
>   at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133)
>   at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107)
>   at 
> org.apache.kafka.clients.consumer.internals.RequestFuture$2.onSuccess(RequestFuture.java:182)
>   at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133)
>   at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:436)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:422)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:679)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:658)
>   at 
> org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:167)
>   at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133)
>   at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.onComplete(ConsumerNetworkClient.java:426)
>   at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:278)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:360)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:192)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:163)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:243)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.ensurePartitionAssignment(ConsumerCoordinator.java:345)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:977)
>   at 
> 
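On the --force idea in the quoted description, here is a hedged sketch (the path and class names below are illustrative, not the actual Streams state-directory layout) of why a LOCK file left behind by a SIGKILLed process can be reclaimed: OS-level file locks die with the process, so only the file itself remains.

{code:java}
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hedged sketch: probe a leftover LOCK file. The path is illustrative.
public class StaleLockProbe {
    public static void main(String[] args) throws IOException {
        Path lockFile = Paths.get("/tmp/kafka-streams/my-app/0_0/LOCK");
        try (FileChannel channel = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            FileLock lock = channel.tryLock();
            if (lock != null) {
                // The previous holder is dead; the SIGKILL released the OS lock
                // even though the file itself remained on disk.
                System.out.println("Lock is stale; safe to reclaim " + lockFile.getParent());
                lock.release();
            } else {
                System.out.println("Another live process still holds " + lockFile);
            }
        }
    }
}
{code}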

[jira] [Commented] (KAFKA-3420) Transient failure in OffsetCommitTest.testNonExistingTopicOffsetCommit

2016-08-02 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404408#comment-15404408
 ] 

Jason Gustafson commented on KAFKA-3420:


I investigated this briefly, but could not reproduce the failure. It actually 
makes sense that the error is returned for the existing topic since all of the 
non-existing topics are filtered before being passed to {{GroupCoordinator}} 
where we check that the request is for the right coordinator. The 
NOT_COORDINATOR_FOR_GROUP error suggests that the __consumer_offsets 
partition corresponding to the group was moved to another broker, but as far as 
I can tell, there is only one broker in this test case, so I'm not too sure 
what's going on. However, I think it's unlikely to be related to KAFKA-2068 as 
I initially suggested. 

> Transient failure in OffsetCommitTest.testNonExistingTopicOffsetCommit
> --
>
> Key: KAFKA-3420
> URL: https://issues.apache.org/jira/browse/KAFKA-3420
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Jason Gustafson
>  Labels: transient-unit-test-failure
>
> From a recent build. Possibly related to KAFKA-2068, which was committed 
> recently.
> {code}
> java.lang.AssertionError: expected:<0> but was:<16>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> kafka.server.OffsetCommitTest.testNonExistingTopicOffsetCommit(OffsetCommitTest.scala:308)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:105)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:56)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:64)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:49)
>   at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.messaging.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>   at 
> org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>   at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
>   at 
> org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:106)
>   at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> 

[jira] [Commented] (KAFKA-1211) Hold the produce request with ack > 1 in purgatory until replicas' HW has larger than the produce offset

2016-08-02 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404404#comment-15404404
 ] 

Jun Rao commented on KAFKA-1211:


[~fpj], for #1 and #2, there are a couple of scenarios that this proposal can fix.
a. The first one is what's described in the original jira. Currently, when the 
follower does truncation, it can truncate some previously committed messages. 
If the follower immediately becomes the leader after truncation, we will lose 
some previously committed messages. This is rare, but if it happens, it's bad. 
The proposal fixes this case by preventing the follower from unnecessarily 
truncating previously committed messages.
b. Another issue is that a portion of the log in different replicas may not 
match in certain failure cases. This can happen when unclean leader election is 
enabled. However, even if unclean leader election is disabled, mis-matching can 
still happen when messages are lost due to power outage (see KAFKA-3919). The 
proposal fixes this issue by making sure that the replicas are always identical.

For #3, the controller increases the leader generation every time the leader 
changes. The latest leader generation is persisted in ZK.

For #4, putting the leader generation in the segment file name is another 
possibility. One concern I had with that approach is dealing with compacted 
topics. After compaction, it's possible that only a small number of messages 
(or even just a single message) are left in a particular generation. Putting 
the generation id in the segment file name will force us to have tiny segments, 
which is not ideal. About the race condition, even with a separate checkpoint 
file, we can avoid it. The sequencing will be (1) broker receives 
LeaderAndIsrRequest to become leader; (2) broker stops fetching from the 
current leader; (3) no new writes can happen to this replica at this point; 
(4) broker writes the new leader generation and log end offset to the 
checkpoint file; (5) broker marks the replica as leader; (6) new writes can 
happen to this replica now (a sketch of this sequencing appears below).

For #5, it depends on who becomes the new leader in that case. If A becomes the 
new leader (generation 3), then B and C will remove m1 and m2 and copy m3 and 
m4 over from A. If B becomes the new leader, A will remove m3 and m4 and copy 
m1 and m2 over from B. In either case, the replicas will be identical.
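A hedged sketch of the sequencing in #4 follows (class and method names are invented for illustration; this is not Kafka's actual replica code):

{code:java}
// Illustrative sketch of the six-step become-leader sequencing above.
public class BecomeLeaderSequence {
    private volatile boolean isLeader = false;
    private int leaderGeneration = 0;
    private long logEndOffset = 0L;

    // (1) caller invokes this when a LeaderAndIsrRequest asks us to become leader
    public synchronized void becomeLeader(int newGeneration) {
        stopFetchingFromCurrentLeader();              // (2)
        // (3) while isLeader == false, the write path rejects produce requests,
        // so no new writes can reach this replica here
        writeCheckpoint(newGeneration, logEndOffset); // (4) persist generation + LEO first
        leaderGeneration = newGeneration;
        isLeader = true;                              // (5) mark as leader; (6) writes resume
    }

    private void stopFetchingFromCurrentLeader() {
        // stop the replica fetcher for this partition
    }

    private void writeCheckpoint(int generation, long offset) {
        System.out.printf("checkpoint: generation=%d logEndOffset=%d%n", generation, offset);
    }
}
{code}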

> Hold the produce request with ack > 1 in purgatory until replicas' HW has 
> larger than the produce offset
> 
>
> Key: KAFKA-1211
> URL: https://issues.apache.org/jira/browse/KAFKA-1211
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Fix For: 0.11.0.0
>
>
> Today during leader failover we will have a period of weakness when the 
> followers truncate their data before fetching from the new leader, i.e., the 
> number of in-sync replicas is just 1. If during this time the leader has also 
> failed, then produce requests with ack > 1 that have already been responded to 
> will still be lost. To avoid this scenario we would prefer to hold the produce 
> request in purgatory until the replicas' HW is larger than the offset, instead 
> of just their end-of-log offsets.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3042) updateIsr should stop after failed several times due to zkVersion issue

2016-08-02 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404314#comment-15404314
 ] 

Jun Rao commented on KAFKA-3042:


Thinking about this a bit more. An alternative approach is to extend 
LeaderAndIsrRequest to include all endpoints for live_leaders. The follower can 
then obtain the right host/port for the leader from 
LeaderAndIsrRequest.live_leaders, instead of the metadata cache. This approach 
is more general and won't depend on the ordering of UpdateMetadataRequest 
and LeaderAndIsrRequest. Since this is a protocol change, we will need to do a 
KIP. With this change, LeaderAndIsrRequest and UpdateMetadataRequest will look 
almost identical. The latter still has an extra field for rack per broker. We 
can look into combining the two requests in the future.

> updateIsr should stop after failed several times due to zkVersion issue
> ---
>
> Key: KAFKA-3042
> URL: https://issues.apache.org/jira/browse/KAFKA-3042
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1
> Environment: jdk 1.7
> centos 6.4
>Reporter: Jiahongchao
> Fix For: 0.10.1.0
>
> Attachments: controller.log, server.log.2016-03-23-01, 
> state-change.log
>
>
> sometimes one broker may repeatedly log
> "Cached zkVersion 54 not equal to that in zookeeper, skip updating ISR"
> I think this is because the broker considers itself the leader when in fact 
> it's a follower.
> So after several failed tries, it needs to find out who the leader is



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1211) Hold the produce request with ack > 1 in purgatory until replicas' HW has larger than the produce offset

2016-08-02 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404245#comment-15404245
 ] 

Flavio Junqueira edited comment on KAFKA-1211 at 8/2/16 4:14 PM:
-

[~junrao] let me ask a few clarification questions.

# Is it right that the scenarios described here do not affect the cases in 
which min isr > 1 and unclean leader election is disabled? If min isr is 
greater than 1 and the leader always comes from the latest isr, then the 
leader can either truncate the followers or have them fetch the missing log 
suffix.
# The main goal of the proposal is to have replicas in a lossy configuration 
(e.g. min isr = 1, unclean leader election enabled), say a leader and a 
follower, converge to a common prefix by choosing an offset based on a common 
generation. The chosen generation is the largest generation in common between 
the two replicas. Is that right?
# How do we guarantee that the generation id is unique - by using zookeeper 
versions?
# I think there is a potential race between updating the 
leader-generation-checkpoint file and appending the first message of the 
generation. We might be better off rolling the log segment file and making the 
generation part of the log segment file name. This way when we start a new 
generation, we also start a new file and we know precisely when a message 
from that generation has been appended.
# Let's consider a scenario with 3 servers A, B, and C. I'm again assuming 
that it is ok for a single server to be up and still ack requests. Say we have 
the following execution:

||Generation||A||B||C||
|1| |m1|m1|
| | |m2|m2|
|2|m3| | |
| |m4| | |

Say that now A and B start generation 3. They have no generation in common, so 
they start from zero, dropping m1 and m2. Is that right? If later on C joins A 
and B, then it will also drop m1 and m2, right? Given that the configuration is 
lossy, it doesn't seem wrong to do it, as all we are trying to do is converge 
to a consistent state. 


was (Author: fpj):
[~junrao] let me ask a few clarification questions.

# Is it right that the scenarios described here do not affect the cases in 
which min isr > 1 and unclean leader election is disabled? If min isr is 
greater than 1 and the leader always comes from the latest isr, then the 
leader can either truncate the followers or have them fetch the missing log 
suffix.
# The main goal of the proposal is to have replicas in a lossy configuration 
(e.g. min isr = 1, unclean leader election enabled), say a leader and a 
follower, converge to a common prefix by choosing an offset based on a common 
generation. The chosen generation is the largest generation in common between 
the two replicas. Is that right?
# How do we guarantee that the generation id is unique - by using zookeeper 
versions?
# I think there is a potential race between updating the 
leader-generation-checkpoint file and appending the first message of the 
generation. We might be better off rolling the log segment file and making the 
generation part of the log segment file name. This way when we start a new 
generation, we also start a new file and we know precisely when a message 
from that generation has been appended.
# Let's consider a scenario with 3 servers A, B, and C. I'm again assuming 
that it is ok for a single server to be up and still ack requests. Say we have 
the following execution:

{noformat}
Generation    A     B     C
1                   m1    m1
                    m2    m2
2             m3
              m4
{noformat}

Say that now A and B start generation 3. They have no generation in common, so 
they start from zero, dropping m1 and m2. Is that right? If later on C joins A 
and B, then it will also drop m1 and m2, right? Given that the configuration is 
lossy, it doesn't seem wrong to do it, as all we are trying to do is converge 
to a consistent state. 

> Hold the produce request with ack > 1 in purgatory until replicas' HW has 
> larger than the produce offset
> 
>
> Key: KAFKA-1211
> URL: https://issues.apache.org/jira/browse/KAFKA-1211
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Fix For: 0.11.0.0
>
>
> Today during leader failover we will have a period of weakness when the 
> followers truncate their data before fetching from the new leader, i.e., the 
> number of in-sync replicas is just 1. If during this time the leader has also 
> failed, then produce requests with ack > 1 that have already been responded to 
> will still be lost. To avoid this scenario we would prefer to hold the produce 
> request in purgatory until the replicas' HW is larger than the offset, instead 
> of just their end-of-log offsets.

[jira] [Commented] (KAFKA-1211) Hold the produce request with ack > 1 in purgatory until replicas' HW has larger than the produce offset

2016-08-02 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404245#comment-15404245
 ] 

Flavio Junqueira commented on KAFKA-1211:
-

[~junrao] let me ask a few clarification questions.

# Is it right that the scenarios described here do not affect the cases in 
which min isr > 1 and unclean leader election is disabled? If min isr is 
greater than 1 and the leader always comes from the latest isr, then the 
leader can either truncate the followers or have them fetch the missing log 
suffix.
# The main goal of the proposal is to have replicas in a lossy configuration 
(e.g. min isr = 1, unclean leader election enabled), say a leader and a 
follower, converge to a common prefix by choosing an offset based on a common 
generation. The chosen generation is the largest generation in common between 
the two replicas. Is that right?
# How do we guarantee that the generation id is unique - by using zookeeper 
versions?
# I think there is a potential race between updating the 
leader-generation-checkpoint file and appending the first message of the 
generation. We might be better off rolling the log segment file and making the 
generation part of the log segment file name. This way when we start a new 
generation, we also start a new file and we know precisely when a message 
from that generation has been appended.
# Let's consider a scenario with 3 servers A, B, and C. I'm again assuming 
that it is ok for a single server to be up and still ack requests. Say we have 
the following execution:

{noformat}
Generation    A     B     C
1                   m1    m1
                    m2    m2
2             m3
              m4
{noformat}

Say that now A and B start generation 3. They have no generation in common, so 
they start from zero, dropping m1 and m2. Is that right? If later on C joins A 
and B, then it will also drop m1 and m2, right? Given that the configuration is 
lossy, it doesn't seem wrong to do it, as all we are trying to do is converge 
to a consistent state. 

> Hold the produce request with ack > 1 in purgatory until replicas' HW has 
> larger than the produce offset
> 
>
> Key: KAFKA-1211
> URL: https://issues.apache.org/jira/browse/KAFKA-1211
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Fix For: 0.11.0.0
>
>
> Today during leader failover we will have a period of weakness when the 
> followers truncate their data before fetching from the new leader, i.e., the 
> number of in-sync replicas is just 1. If during this time the leader has also 
> failed, then produce requests with ack > 1 that have already been responded to 
> will still be lost. To avoid this scenario we would prefer to hold the produce 
> request in purgatory until the replicas' HW is larger than the offset, instead 
> of just their end-of-log offsets.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2063) Bound fetch response size

2016-08-02 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404147#comment-15404147
 ] 

Jun Rao commented on KAFKA-2063:


For #3, the main reason for people to customize the partition-level fetch size 
is to accommodate large messages. We can make the default max_response_bytes 
something like 10MB, which is 10 times larger than the default message size. If 
max_response_bytes >= the partition-level fetch size, we just ignore the 
latter. Otherwise, we can error out and advise users to increase 
max_response_bytes if truly needed.
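A hedged sketch of that rule (max_response_bytes and the 10MB default come from this discussion and are not final configs):

{code:java}
// Sketch of the proposed interaction between the new response-level limit
// and the existing partition-level fetch size; names are placeholders.
public final class FetchLimitCheck {
    static void validate(int maxResponseBytes, int maxPartitionFetchBytes) {
        if (maxResponseBytes >= maxPartitionFetchBytes) {
            return; // partition-level limit is effectively ignored
        }
        throw new IllegalArgumentException(
            "max_response_bytes (" + maxResponseBytes + ") is smaller than the "
            + "partition-level fetch size (" + maxPartitionFetchBytes + "); "
            + "increase max_response_bytes if large messages are truly needed");
    }

    public static void main(String[] args) {
        validate(10 * 1024 * 1024, 1024 * 1024); // 10MB default vs 1MB partition limit
    }
}
{code}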

> Bound fetch response size
> -
>
> Key: KAFKA-2063
> URL: https://issues.apache.org/jira/browse/KAFKA-2063
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jay Kreps
>
> Currently the only bound on the fetch response size is 
> max.partition.fetch.bytes * num_partitions. There are two problems:
> 1. First this bound is often large. You may choose 
> max.partition.fetch.bytes=1MB to enable messages of up to 1MB. However if you 
> also need to consume 1k partitions this means you may receive a 1GB response 
> in the worst case!
> 2. The actual memory usage is unpredictable. Partition assignment changes, 
> and you only actually get the full fetch amount when you are behind and there 
> is a full chunk of data ready. This means an application that seems to work 
> fine will suddenly OOM when partitions shift or when the application falls 
> behind.
> We need to decouple the fetch response size from the number of partitions.
> The proposal for doing this would be to add a new field to the fetch request, 
> max_bytes which would control the maximum data bytes we would include in the 
> response.
> The implementation on the server side would grab data from each partition in 
> the fetch request until it hit this limit, then send back just the data for 
> the partitions that fit in the response. The implementation would need to 
> start from a random position in the list of topics included in the fetch 
> request to ensure that in a case of backlog we fairly balance between 
> partitions (to avoid first giving just the first partition until that is 
> exhausted, then the next partition, etc).
> This setting will make the max.partition.fetch.bytes field in the fetch 
> request much less useful and we  should discuss just getting rid of it.
> I believe this also solves the same thing we were trying to address in 
> KAFKA-598. The max_bytes setting now becomes the new limit that would need to 
> be compared to max_message size. This can be much larger--e.g. setting a 50MB 
> max_bytes setting would be okay, whereas now if you set 50MB you may need to 
> allocate 50MB*num_partitions.
> This will require evolving the fetch request protocol version to add the new 
> field and we should do a KIP for it.
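Read literally, the server-side selection described in the quoted proposal could look like this hedged sketch (simplified types; partition names are plain strings, not broker code):

{code:java}
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// Hedged sketch of the proposed bounding: walk the requested partitions
// starting at a random index (for fairness under backlog) and stop handing
// out data once max_bytes is spent.
public class BoundedFetchSketch {
    static Map<String, Integer> select(List<String> partitions,
                                       Map<String, Integer> availableBytes,
                                       int maxBytes) {
        Map<String, Integer> response = new LinkedHashMap<>();
        if (partitions.isEmpty() || maxBytes <= 0)
            return response;
        int start = ThreadLocalRandom.current().nextInt(partitions.size());
        int budget = maxBytes;
        for (int i = 0; i < partitions.size() && budget > 0; i++) {
            String p = partitions.get((start + i) % partitions.size());
            int take = Math.min(budget, availableBytes.getOrDefault(p, 0));
            if (take > 0) {
                response.put(p, take);
                budget -= take;
            }
        }
        return response;
    }
}
{code}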



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[VOTE] 0.10.0.1 RC1

2016-08-02 Thread Ismael Juma
Hello Kafka users, developers and client-developers,

This is the second candidate for the release of Apache Kafka 0.10.0.1. This
is a bug fix release and it includes fixes and improvements from 52 JIRAs
(including a few critical bugs). See the release notes for more details:

http://home.apache.org/~ijuma/kafka-0.10.0.1-rc1/RELEASE_NOTES.html

When compared to RC0, RC1 contains fixes for two bugs (KAFKA-4008
and KAFKA-3950) and a couple of test stabilisation fixes.

*** Please download, test and vote by Friday, 5 August, 8am PT ***

Kafka's KEYS file containing PGP keys we use to sign the release:
http://kafka.apache.org/KEYS

* Release artifacts to be voted upon (source and binary):
http://home.apache.org/~ijuma/kafka-0.10.0.1-rc1/

* Maven artifacts to be voted upon:
https://repository.apache.org/content/groups/staging

* Javadoc:
http://home.apache.org/~ijuma/kafka-0.10.0.1-rc1/javadoc/

* Tag to be voted upon (off 0.10.0 branch) is the 0.10.0.1-rc1 tag:
https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=108580e4594d694827c953264969fe1ce2a7

* Documentation:
http://kafka.apache.org/0100/documentation.html

* Protocol:
http://kafka.apache.org/0100/protocol.html

* Successful Jenkins builds for the 0.10.0 branch:
Unit/integration tests: https://builds.apache.org/job/kafka-0.10.0-jdk7/179/
System tests: https://jenkins.confluent.io/job/system-test-kafka-0.10.0/136/

Thanks,
Ismael


Jenkins build is back to normal : kafka-trunk-jdk7 #1456

2016-08-02 Thread Apache Jenkins Server
See 



Build failed in Jenkins: kafka-trunk-jdk8 #794

2016-08-02 Thread Apache Jenkins Server
See 

--
Started by user ijuma
[EnvInject] - Loading node environment variables.
Building remotely on jenkins-test-24f (jenkins-cloud-4GB cloud-slave Ubuntu 
ubuntu jenkins-cloud-8GB) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url 
 > https://git-wip-us.apache.org/repos/asf/kafka.git # timeout=10
Fetching upstream changes from https://git-wip-us.apache.org/repos/asf/kafka.git
 > git --version # timeout=10
 > git -c core.askpass=true fetch --tags --progress 
 > https://git-wip-us.apache.org/repos/asf/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/trunk^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/trunk^{commit} # timeout=10
Checking out Revision 3bb38d37b7a3fe2dc794c717df1d67e8f9a8af21 
(refs/remotes/origin/trunk)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 3bb38d37b7a3fe2dc794c717df1d67e8f9a8af21
 > git rev-list 3bb38d37b7a3fe2dc794c717df1d67e8f9a8af21 # timeout=10
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
[kafka-trunk-jdk8] $ /bin/bash -xe /tmp/hudson240601561665744225.sh
+ rm -rf 
+ 
/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2/bin/gradle

ERROR: JAVA_HOME is set to an invalid directory: 
/home/jenkins/tools/java/latest1.8

Please set the JAVA_HOME variable in your environment to match the
location of your Java installation.

Build step 'Execute shell' marked build as failure
Recording test results
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
ERROR: Step ‘Publish JUnit test result report’ failed: No test report files 
were found. Configuration error?
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2


Build failed in Jenkins: kafka-trunk-jdk8 #793

2016-08-02 Thread Apache Jenkins Server
See 

Changes:

[ismael] HOTFIX: Start embedded kafka in KafkaStreamsTest to avoid hanging

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on jenkins-test-24f (jenkins-cloud-4GB cloud-slave Ubuntu 
ubuntu jenkins-cloud-8GB) in workspace 

Cloning the remote Git repository
Cloning repository https://git-wip-us.apache.org/repos/asf/kafka.git
 > git init  # timeout=10
Fetching upstream changes from https://git-wip-us.apache.org/repos/asf/kafka.git
 > git --version # timeout=10
 > git -c core.askpass=true fetch --tags --progress 
 > https://git-wip-us.apache.org/repos/asf/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git config remote.origin.url 
 > https://git-wip-us.apache.org/repos/asf/kafka.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # 
 > timeout=10
 > git config remote.origin.url 
 > https://git-wip-us.apache.org/repos/asf/kafka.git # timeout=10
Fetching upstream changes from https://git-wip-us.apache.org/repos/asf/kafka.git
 > git -c core.askpass=true fetch --tags --progress 
 > https://git-wip-us.apache.org/repos/asf/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/trunk^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/trunk^{commit} # timeout=10
Checking out Revision 3bb38d37b7a3fe2dc794c717df1d67e8f9a8af21 
(refs/remotes/origin/trunk)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 3bb38d37b7a3fe2dc794c717df1d67e8f9a8af21
 > git rev-list f7976d2fc1793d0f635b42eb4dca3810e40c4cc8 # timeout=10
Unpacking http://services.gradle.org/distributions/gradle-2.4-rc-2-bin.zip to 
/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
 on jenkins-test-24f
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
[kafka-trunk-jdk8] $ /bin/bash -xe /tmp/hudson4874562701962342490.sh
+ rm -rf 
+ 
/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2/bin/gradle

ERROR: JAVA_HOME is set to an invalid directory: 
/home/jenkins/tools/java/latest1.8

Please set the JAVA_HOME variable in your environment to match the
location of your Java installation.

Build step 'Execute shell' marked build as failure
Recording test results
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
ERROR: Step ‘Publish JUnit test result report’ failed: No test report files 
were found. Configuration error?
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2


[GitHub] kafka pull request #1693: HOTFIX: start embedded kafka in KafkaStreamsTest t...

2016-08-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1693


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build is back to normal : kafka-0.10.0-jdk7 #178

2016-08-02 Thread Apache Jenkins Server
See 



[GitHub] kafka pull request #1693: HOTFIX: start embedded kafka in KafkaStreamsTest t...

2016-08-02 Thread dguy
GitHub user dguy opened a pull request:

https://github.com/apache/kafka/pull/1693

HOTFIX: start embedded kafka in KafkaStreamsTest to avoid hanging

The KafkaStreamsTest can occasionally hang if the test doesn't run fast 
enough. This is due to there being no brokers available at the broker.urls 
provided to the StreamsConfig. The KafkaConsumer does a poll and blocks, 
causing the test to never complete.
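A hedged sketch of the shape of the fix (the EmbeddedKafkaCluster name comes from the commit message below; its exact package and API in the Kafka test sources may differ):

{code:java}
import java.util.Properties;

import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.integration.utils.EmbeddedKafkaCluster;
import org.junit.ClassRule;
import org.junit.Test;

// Sketch only: point the test's StreamsConfig at a live embedded broker so
// that KafkaConsumer.poll() cannot block forever waiting for one to appear.
public class KafkaStreamsTestSketch {
    @ClassRule
    public static final EmbeddedKafkaCluster CLUSTER = new EmbeddedKafkaCluster(1);

    @Test
    public void shouldCloseWithoutHanging() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "kafka-streams-test");
        // bootstrapServers() is assumed here: the embedded cluster exposes
        // the address of the broker it started.
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
        // construct and close KafkaStreams against these props; with a real
        // broker behind the config the test can complete promptly
    }
}
{code}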

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dguy/kafka kafka-streams-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1693.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1693


commit 44b9178d80091d998b39d89e3c352737c76eee25
Author: Damian Guy 
Date:   2016-08-02T11:24:20Z

start an EmbeddedKafkaCluster in KafkaStreamsTest to avoid hanging on 
KafkaConsumer.poll




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-2063) Bound fetch response size

2016-08-02 Thread Andrey Neporada (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403787#comment-15403787
 ] 

Andrey Neporada commented on KAFKA-2063:


1. I will submit a KIP.
2. Clear on that.
3. Yes, we will need a new property at both the broker (for 
ReplicaFetcherThread) and client levels. The open question is what we should 
do if a client has set the partition-level property explicitly.

> Bound fetch response size
> -
>
> Key: KAFKA-2063
> URL: https://issues.apache.org/jira/browse/KAFKA-2063
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jay Kreps
>
> Currently the only bound on the fetch response size is 
> max.partition.fetch.bytes * num_partitions. There are two problems:
> 1. First this bound is often large. You may choose 
> max.partition.fetch.bytes=1MB to enable messages of up to 1MB. However if you 
> also need to consume 1k partitions this means you may receive a 1GB response 
> in the worst case!
> 2. The actual memory usage is unpredictable. Partition assignment changes, 
> and you only actually get the full fetch amount when you are behind and there 
> is a full chunk of data ready. This means an application that seems to work 
> fine will suddenly OOM when partitions shift or when the application falls 
> behind.
> We need to decouple the fetch response size from the number of partitions.
> The proposal for doing this would be to add a new field to the fetch request, 
> max_bytes which would control the maximum data bytes we would include in the 
> response.
> The implementation on the server side would grab data from each partition in 
> the fetch request until it hit this limit, then send back just the data for 
> the partitions that fit in the response. The implementation would need to 
> start from a random position in the list of topics included in the fetch 
> request to ensure that in a case of backlog we fairly balance between 
> partitions (to avoid first giving just the first partition until that is 
> exhausted, then the next partition, etc).
> This setting will make the max.partition.fetch.bytes field in the fetch 
> request much less useful and we  should discuss just getting rid of it.
> I believe this also solves the same thing we were trying to address in 
> KAFKA-598. The max_bytes setting now becomes the new limit that would need to 
> be compared to max_message size. This can be much larger--e.g. setting a 50MB 
> max_bytes setting would be okay, whereas now if you set 50MB you may need to 
> allocate 50MB*num_partitions.
> This will require evolving the fetch request protocol version to add the new 
> field and we should do a KIP for it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2063) Bound fetch response size

2016-08-02 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403700#comment-15403700
 ] 

Ismael Juma commented on KAFKA-2063:


[~nepal], I believe the answer is yes to 1) and 2). A few points (you may be 
aware of them already, but I just want to be clear):

1. A change like this needs to follow the KIP process. Would you be willing to 
submit a KIP for discussion?

2. Any change we make must take compatibility into account. So, even though we 
want the new version of the fetch request not to have the partition level 
limit, the broker will still have to support the partition level limit for the 
older versions of the fetch request. This is true both when it receives the 
fetch request in KafkaApis and when it sends the fetch request in 
ReplicaFetcherThread. In the latter case, we will have to use the 
inter.broker.protocol.version property to figure out which version of the fetch 
request to send.

3. It makes sense to have a new property for the new request level limit. I 
think we need it on the broker and client as well, right? Also, if we deprecate 
the partition-level properties, we need to think through the consequences for 
users who have custom values for these properties and if we can mitigate these 
potential issues. This is always important, but it's of particular importance 
if we want to include this in a minor release (e.g. 0.10.1.0).

> Bound fetch response size
> -
>
> Key: KAFKA-2063
> URL: https://issues.apache.org/jira/browse/KAFKA-2063
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jay Kreps
>
> Currently the only bound on the fetch response size is 
> max.partition.fetch.bytes * num_partitions. There are two problems:
> 1. First this bound is often large. You may choose 
> max.partition.fetch.bytes=1MB to enable messages of up to 1MB. However if you 
> also need to consume 1k partitions this means you may receive a 1GB response 
> in the worst case!
> 2. The actual memory usage is unpredictable. Partition assignment changes, 
> and you only actually get the full fetch amount when you are behind and there 
> is a full chunk of data ready. This means an application that seems to work 
> fine will suddenly OOM when partitions shift or when the application falls 
> behind.
> We need to decouple the fetch response size from the number of partitions.
> The proposal for doing this would be to add a new field to the fetch request, 
> max_bytes which would control the maximum data bytes we would include in the 
> response.
> The implementation on the server side would grab data from each partition in 
> the fetch request until it hit this limit, then send back just the data for 
> the partitions that fit in the response. The implementation would need to 
> start from a random position in the list of topics included in the fetch 
> request to ensure that in a case of backlog we fairly balance between 
> partitions (to avoid first giving just the first partition until that is 
> exhausted, then the next partition, etc).
> This setting will make the max.partition.fetch.bytes field in the fetch 
> request much less useful and we  should discuss just getting rid of it.
> I believe this also solves the same thing we were trying to address in 
> KAFKA-598. The max_bytes setting now becomes the new limit that would need to 
> be compared to max_message size. This can be much larger--e.g. setting a 50MB 
> max_bytes setting would be okay, whereas now if you set 50MB you may need to 
> allocate 50MB*num_partitions.
> This will require evolving the fetch request protocol version to add the new 
> field and we should do a KIP for it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: kafka-0.10.0-jdk7 #177

2016-08-02 Thread Apache Jenkins Server
See 

--
Started by user ijuma
[EnvInject] - Loading node environment variables.
Building remotely on H10 (docker Ubuntu ubuntu yahoo-not-h2) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url 
 > https://git-wip-us.apache.org/repos/asf/kafka.git # timeout=10
Fetching upstream changes from https://git-wip-us.apache.org/repos/asf/kafka.git
 > git --version # timeout=10
 > git -c core.askpass=true fetch --tags --progress 
 > https://git-wip-us.apache.org/repos/asf/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/0.10.0^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/0.10.0^{commit} # timeout=10
Checking out Revision f2405a73ea2dd4b636832b7f8729fb06a04de1d5 
(refs/remotes/origin/0.10.0)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f f2405a73ea2dd4b636832b7f8729fb06a04de1d5
 > git rev-list f2405a73ea2dd4b636832b7f8729fb06a04de1d5 # timeout=10
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
[kafka-0.10.0-jdk7] $ /bin/bash -xe /tmp/hudson7767009767832132156.sh
+ rm -rf 
+ 
/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2/bin/gradle
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
http://gradle.org/docs/2.4-rc-2/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
:downloadWrapper

BUILD SUCCESSFUL

Total time: 16.797 secs
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
[kafka-0.10.0-jdk7] $ /bin/bash -xe /tmp/hudson2094953924346521551.sh
+ export GRADLE_OPTS=-Xmx1024m -XX:MaxPermSize=256m
+ GRADLE_OPTS=-Xmx1024m
/tmp/hudson2094953924346521551.sh: line 2: export: `-XX:MaxPermSize=256m': not 
a valid identifier
Build step 'Execute shell' marked build as failure
Recording test results
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
ERROR: Step ‘Publish JUnit test result report’ failed: Test reports were found 
but none of them are new. Did tests run? 
For example, 

 is 4 hr 25 min old

Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2


[jira] [Commented] (KAFKA-2063) Bound fetch response size

2016-08-02 Thread Andrey Neporada (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403676#comment-15403676
 ] 

Andrey Neporada commented on KAFKA-2063:


(a) I am referring to some new server-side setting - something like 
fetch.partition.max.bytes (?). The broker setting replica.fetch.max.bytes 
should be deprecated along with the consumer settings fetch.message.max.bytes 
and max.partition.fetch.bytes.

(b) Maybe I am getting ahead of myself here. In the context of this ticket, 
yes, the only goal of reordering is to make progress and enforce fairness. And 
this can all be done on the client side. 

(c) I mean making the fetch request deterministic on the server side - fetch 
responses will go in the order requested by the client.
(d) Yes, we should clearly document that clients who want to limit the entire 
fetch response should also apply some method to avoid starvation/unfairness - 
either random shuffling or round robin (see the sketch below). Random 
shuffling seems to be easier to implement and IMHO it will work well enough 
for ReplicaFetcherThread.

In general, it looks like most people would like to
1) retire the partition-level limit from the fetch request
2) keep the fetch order the same as the order of partitions in the fetch request

Should I update the PR? Any objections?
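For (d), a minimal hedged sketch of the client-side shuffling (partitions as plain strings for illustration):

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Sketch of the fairness idea in (d): shuffle the partition order before
// building each fetch request so that, with a response-level byte limit,
// no partition is permanently starved.
public class ShuffledFetchOrder {
    static List<String> nextFetchOrder(List<String> partitions) {
        List<String> order = new ArrayList<>(partitions);
        Collections.shuffle(order);
        return order;
    }

    public static void main(String[] args) {
        System.out.println(nextFetchOrder(Arrays.asList("t-0", "t-1", "t-2")));
    }
}
{code}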



> Bound fetch response size
> -
>
> Key: KAFKA-2063
> URL: https://issues.apache.org/jira/browse/KAFKA-2063
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jay Kreps
>
> Currently the only bound on the fetch response size is 
> max.partition.fetch.bytes * num_partitions. There are two problems:
> 1. First this bound is often large. You may choose 
> max.partition.fetch.bytes=1MB to enable messages of up to 1MB. However if you 
> also need to consume 1k partitions this means you may receive a 1GB response 
> in the worst case!
> 2. The actual memory usage is unpredictable. Partition assignment changes, 
> and you only actually get the full fetch amount when you are behind and there 
> is a full chunk of data ready. This means an application that seems to work 
> fine will suddenly OOM when partitions shift or when the application falls 
> behind.
> We need to decouple the fetch response size from the number of partitions.
> The proposal for doing this would be to add a new field to the fetch request, 
> max_bytes which would control the maximum data bytes we would include in the 
> response.
> The implementation on the server side would grab data from each partition in 
> the fetch request until it hit this limit, then send back just the data for 
> the partitions that fit in the response. The implementation would need to 
> start from a random position in the list of topics included in the fetch 
> request to ensure that in a case of backlog we fairly balance between 
> partitions (to avoid first giving just the first partition until that is 
> exhausted, then the next partition, etc).
> This setting will make the max.partition.fetch.bytes field in the fetch 
> request much less useful and we  should discuss just getting rid of it.
> I believe this also solves the same thing we were trying to address in 
> KAFKA-598. The max_bytes setting now becomes the new limit that would need to 
> be compared to max_message size. This can be much larger--e.g. setting a 50MB 
> max_bytes setting would be okay, whereas now if you set 50MB you may need to 
> allocate 50MB*num_partitions.
> This will require evolving the fetch request protocol version to add the new 
> field and we should do a KIP for it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: kafka-0.10.0-jdk7 #175

2016-08-02 Thread Apache Jenkins Server
See 

Changes:

[me] KAFKA-4008: Module "tools" should not be dependent on "core"

--
[...truncated 6390 lines...]

org.apache.kafka.connect.storage.OffsetStorageWriterTest > 
testCancelAfterAwaitFlush PASSED

org.apache.kafka.connect.storage.FileOffsetBackingStoreTest > testSaveRestore 
PASSED

org.apache.kafka.connect.storage.FileOffsetBackingStoreTest > testGetSet PASSED

org.apache.kafka.connect.util.KafkaBasedLogTest > testStartStop PASSED

org.apache.kafka.connect.util.KafkaBasedLogTest > testReloadOnStart PASSED

org.apache.kafka.connect.util.KafkaBasedLogTest > testProducerError PASSED

org.apache.kafka.connect.util.KafkaBasedLogTest > testSendAndReadToEnd PASSED

org.apache.kafka.connect.util.KafkaBasedLogTest > testConsumerError PASSED

org.apache.kafka.connect.util.ShutdownableThreadTest > testGracefulShutdown 
PASSED

org.apache.kafka.connect.util.ShutdownableThreadTest > testForcibleShutdown 
PASSED

org.apache.kafka.connect.util.TableTest > basicOperations PASSED

org.apache.kafka.connect.runtime.AbstractHerderTest > connectorStatus PASSED

org.apache.kafka.connect.runtime.AbstractHerderTest > taskStatus PASSED

org.apache.kafka.connect.runtime.WorkerTaskTest > stopBeforeStarting PASSED

org.apache.kafka.connect.runtime.WorkerTaskTest > standardStartup PASSED

org.apache.kafka.connect.runtime.WorkerTaskTest > cancelBeforeStopping PASSED

org.apache.kafka.connect.runtime.WorkerSinkTaskTest > testStartPaused PASSED

org.apache.kafka.connect.runtime.WorkerSinkTaskTest > testPause PASSED

org.apache.kafka.connect.runtime.WorkerSinkTaskTest > testPollRedelivery PASSED

org.apache.kafka.connect.runtime.WorkerSinkTaskTest > 
testErrorInRebalancePartitionRevocation PASSED

org.apache.kafka.connect.runtime.WorkerSinkTaskTest > 
testErrorInRebalancePartitionAssignment PASSED

org.apache.kafka.connect.runtime.WorkerSinkTaskTest > 
testWakeupInCommitSyncCausesRetry PASSED

org.apache.kafka.connect.runtime.WorkerSourceTaskTest > testStartPaused PASSED

org.apache.kafka.connect.runtime.WorkerSourceTaskTest > testPause PASSED

org.apache.kafka.connect.runtime.WorkerSourceTaskTest > testPollsInBackground 
PASSED

org.apache.kafka.connect.runtime.WorkerSourceTaskTest > testFailureInPoll PASSED

org.apache.kafka.connect.runtime.WorkerSourceTaskTest > testCommit PASSED

org.apache.kafka.connect.runtime.WorkerSourceTaskTest > testCommitFailure PASSED

org.apache.kafka.connect.runtime.WorkerSourceTaskTest > 
testSendRecordsConvertsData PASSED

org.apache.kafka.connect.runtime.WorkerSourceTaskTest > testSendRecordsRetries 
PASSED

org.apache.kafka.connect.runtime.WorkerSourceTaskTest > 
testSendRecordsTaskCommitRecordFail PASSED

org.apache.kafka.connect.runtime.WorkerSourceTaskTest > testSlowTaskStart PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testPutConnectorConfig PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartTask PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartUnknownTask PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartTaskRedirectToLeader PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartTaskRedirectToOwner PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testConnectorConfigAdded PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testConnectorConfigUpdate PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testConnectorPaused PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testConnectorResumed PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testUnknownConnectorPaused PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testConnectorPausedRunningTaskOnly PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testConnectorResumedRunningTaskOnly PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testTaskConfigAdded PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testJoinLeaderCatchUpFails PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testAccessors PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testJoinAssignment PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRebalance PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRebalanceFailedConnector PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testHaltCleansUpWorker PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testCreateConnector PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testDestroyConnector PASSED


Build failed in Jenkins: kafka-trunk-jdk8 #792

2016-08-02 Thread Apache Jenkins Server
See 

Changes:

[me] KAFKA-4008: Module "tools" should not be dependent on "core"

--
[...truncated 5923 lines...]

kafka.api.SslEndToEndAuthorizationTest > testProduceConsumeViaSubscribe STARTED

kafka.api.SslEndToEndAuthorizationTest > testProduceConsumeViaSubscribe PASSED

kafka.api.QuotasTest > testProducerConsumerOverrideUnthrottled STARTED

kafka.api.QuotasTest > testProducerConsumerOverrideUnthrottled PASSED

kafka.api.QuotasTest > testThrottledProducerConsumer STARTED

kafka.api.QuotasTest > testThrottledProducerConsumer PASSED

kafka.api.ProducerBounceTest > testBrokerFailure STARTED

kafka.api.ProducerBounceTest > testBrokerFailure PASSED

kafka.api.test.ProducerCompressionTest > testCompression[0] STARTED

kafka.api.test.ProducerCompressionTest > testCompression[0] PASSED

kafka.api.test.ProducerCompressionTest > testCompression[1] STARTED

kafka.api.test.ProducerCompressionTest > testCompression[1] PASSED

kafka.api.test.ProducerCompressionTest > testCompression[2] STARTED

kafka.api.test.ProducerCompressionTest > testCompression[2] PASSED

kafka.api.test.ProducerCompressionTest > testCompression[3] STARTED

kafka.api.test.ProducerCompressionTest > testCompression[3] PASSED

kafka.api.SaslSslConsumerTest > testPauseStateNotPreservedByRebalance STARTED

kafka.api.SaslSslConsumerTest > testPauseStateNotPreservedByRebalance PASSED

kafka.api.SaslSslConsumerTest > testUnsubscribeTopic STARTED

kafka.api.SaslSslConsumerTest > testUnsubscribeTopic PASSED

kafka.api.SaslSslConsumerTest > testListTopics STARTED

kafka.api.SaslSslConsumerTest > testListTopics PASSED

kafka.api.SaslSslConsumerTest > testAutoCommitOnRebalance STARTED

kafka.api.SaslSslConsumerTest > testAutoCommitOnRebalance PASSED

kafka.api.SaslSslConsumerTest > testSimpleConsumption STARTED

kafka.api.SaslSslConsumerTest > testSimpleConsumption PASSED

kafka.api.SaslSslConsumerTest > testPartitionReassignmentCallback STARTED

kafka.api.SaslSslConsumerTest > testPartitionReassignmentCallback PASSED

kafka.api.SaslSslConsumerTest > testCommitSpecifiedOffsets STARTED

kafka.api.SaslSslConsumerTest > testCommitSpecifiedOffsets PASSED

kafka.api.ApiUtilsTest > testShortStringNonASCII STARTED

kafka.api.ApiUtilsTest > testShortStringNonASCII PASSED

kafka.api.ApiUtilsTest > testShortStringASCII STARTED

kafka.api.ApiUtilsTest > testShortStringASCII PASSED

kafka.api.PlaintextConsumerTest > testPartitionsForAutoCreate STARTED

kafka.api.PlaintextConsumerTest > testPartitionsForAutoCreate PASSED

kafka.api.PlaintextConsumerTest > testShrinkingTopicSubscriptions STARTED

kafka.api.PlaintextConsumerTest > testShrinkingTopicSubscriptions PASSED

kafka.api.PlaintextConsumerTest > testSubsequentPatternSubscription STARTED

kafka.api.PlaintextConsumerTest > testSubsequentPatternSubscription PASSED

kafka.api.PlaintextConsumerTest > testAsyncCommit STARTED

kafka.api.PlaintextConsumerTest > testAsyncCommit PASSED

kafka.api.PlaintextConsumerTest > testMultiConsumerSessionTimeoutOnStopPolling 
STARTED

kafka.api.PlaintextConsumerTest > testMultiConsumerSessionTimeoutOnStopPolling 
PASSED

kafka.api.PlaintextConsumerTest > testPartitionsForInvalidTopic STARTED

kafka.api.PlaintextConsumerTest > testPartitionsForInvalidTopic PASSED

kafka.api.PlaintextConsumerTest > testSeek STARTED

kafka.api.PlaintextConsumerTest > testSeek PASSED

kafka.api.PlaintextConsumerTest > testPositionAndCommit STARTED

kafka.api.PlaintextConsumerTest > testPositionAndCommit PASSED

kafka.api.PlaintextConsumerTest > testMultiConsumerSessionTimeoutOnClose STARTED

kafka.api.PlaintextConsumerTest > testMultiConsumerSessionTimeoutOnClose PASSED

kafka.api.PlaintextConsumerTest > testFetchRecordTooLarge STARTED

kafka.api.PlaintextConsumerTest > testFetchRecordTooLarge PASSED

kafka.api.PlaintextConsumerTest > testMultiConsumerDefaultAssignment STARTED

kafka.api.PlaintextConsumerTest > testMultiConsumerDefaultAssignment PASSED

kafka.api.PlaintextConsumerTest > testAutoCommitOnClose STARTED

kafka.api.PlaintextConsumerTest > testAutoCommitOnClose PASSED

kafka.api.PlaintextConsumerTest > testExpandingTopicSubscriptions STARTED

kafka.api.PlaintextConsumerTest > testExpandingTopicSubscriptions PASSED

kafka.api.PlaintextConsumerTest > testInterceptors STARTED

kafka.api.PlaintextConsumerTest > testInterceptors PASSED

kafka.api.PlaintextConsumerTest > testPatternUnsubscription STARTED

kafka.api.PlaintextConsumerTest > testPatternUnsubscription PASSED

kafka.api.PlaintextConsumerTest > testGroupConsumption STARTED

kafka.api.PlaintextConsumerTest > testGroupConsumption PASSED

kafka.api.PlaintextConsumerTest > testPartitionsFor STARTED

kafka.api.PlaintextConsumerTest > testPartitionsFor PASSED

kafka.api.PlaintextConsumerTest > testInterceptorsWithWrongKeyValue STARTED

kafka.api.PlaintextConsumerTest > testInterceptorsWithWrongKeyValue PASSED

kafka.api.PlaintextConsumerTest >