[jira] [Commented] (KAFKA-4915) Make logging consistent

2017-07-21 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16097077#comment-16097077
 ] 

Ewen Cheslack-Postava commented on KAFKA-4915:
--

[~dernasherbrezon] No, this is actually great, I just wasn't seeing it! Unless 
you have structured logging output that makes it easy to filter this stuff, 
messages like that can definitely be annoying noise.

I just did a quick grep for possible similar issues:

{code}
git grep 'log[.][^e]' | grep -i error
{code}

and although some of the results are false positives, there are quite a few 
warn- and debug-level messages that contain the word "error". A broader patch to 
clean these up would be great. Any interest in submitting a PR? I'd be happy to 
get it committed.
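For anyone picking this up, here is a minimal sketch of the kind of change such 
a PR would make, assuming slf4j-style loggers as used in the clients (the 
variable names are illustrative, not the actual NetworkClient code):

{code}
// Before: a WARN-level message whose wording suggests an error
log.warn("Error while fetching metadata with correlation id {} : {}",
         correlationId, partitionErrors);

// After: same level and same information, but the wording now matches the
// severity of the level it is logged at
log.warn("Unable to fetch metadata with correlation id {} : {}",
         correlationId, partitionErrors);
{code}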

> Make logging consistent
> ---
>
> Key: KAFKA-4915
> URL: https://issues.apache.org/jira/browse/KAFKA-4915
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.1.1
>Reporter: Andrey
>
> Currently we see in logs:
> {code}
> WARN Error while fetching metadata with correlation id 659455 : 
> {kafka-connect-offsets=INVALID_REPLICATION_FACTOR} 
> (org.apache.kafka.clients.NetworkClient:600)
> {code}
> Expected:
> {code}
> WARN unable to fetch metadata with correlation id 659455 : 
> {kafka-connect-offsets=INVALID_REPLICATION_FACTOR} 
> (org.apache.kafka.clients.NetworkClient:600)
> {code}
> or
> {code}
> ERROR unable to fetch metadata with correlation id 659455 : 
> {kafka-connect-offsets=INVALID_REPLICATION_FACTOR} 
> (org.apache.kafka.clients.NetworkClient:600)
> {code}
> See for reference: 
> https://github.com/apache/kafka/blob/65650ba4dcba8a9729cb9cb6477a62a7b7c3714e/clients/src/main/java/org/apache/kafka/clients/NetworkClient.java#L723



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-4915) Don't include "error" in log messages at non-error levels

2017-07-21 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4915:
-
Summary: Don't include "error" in log messages at non-error levels  (was: 
Make logging consistent)

> Don't include "error" in log messages at non-error levels
> -
>
> Key: KAFKA-4915
> URL: https://issues.apache.org/jira/browse/KAFKA-4915
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.1.1
>Reporter: Andrey
>
> Currently we see in logs:
> {code}
> WARN Error while fetching metadata with correlation id 659455 : 
> {kafka-connect-offsets=INVALID_REPLICATION_FACTOR} 
> (org.apache.kafka.clients.NetworkClient:600)
> {code}
> Expected:
> {code}
> WARN unable to fetch metadata with correlation id 659455 : 
> {kafka-connect-offsets=INVALID_REPLICATION_FACTOR} 
> (org.apache.kafka.clients.NetworkClient:600)
> {code}
> or
> {code}
> ERROR unable to fetch metadata with correlation id 659455 : 
> {kafka-connect-offsets=INVALID_REPLICATION_FACTOR} 
> (org.apache.kafka.clients.NetworkClient:600)
> {code}
> See for reference: 
> https://github.com/apache/kafka/blob/65650ba4dcba8a9729cb9cb6477a62a7b7c3714e/clients/src/main/java/org/apache/kafka/clients/NetworkClient.java#L723



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (KAFKA-5611) One or more consumers in a consumer-group stop consuming after rebalancing

2017-07-21 Thread Panos Skianis (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16097048#comment-16097048
 ] 

Panos Skianis edited comment on KAFKA-5611 at 7/22/17 1:18 AM:
---

Somehow managed to recreate this in a different environment, but still cannot do 
it at will. I am attaching a filtered log from the "consumer" app that had the 
issue. 
Debugging was switched on for the consumer internals package of the kafka 
clients (o.a.k.c.consumer.internals). 

We have introduced a way to force the kafka consumer code to be "restarted" 
manually. At the moment, on a cluster of 4 kafkas/3 zookeepers and 3 client 
consumer apps, we have been trying to recreate this problem by causing 
rebalancing on purpose (group.id = salesforce-group-gb). You may notice other 
groups as well, since this app is using multiple groups, but those are OK at 
the moment AFAIK. 

The file related to this comment is attached as bad-server-with-more-logging-1 
(tarballed and gzipped). Let me know if you want me to provide it in another 
format.
There are some comments inside that file (normally starting with the word 
COMMENT). 

Note: the app is using akka-kafka-streams and I believe there is a timeout 
involved. You will be able to see this if you search for WakeupException.
I am wondering if that is causing the issue, because there is a delay in 
getting the information back from kafka (timeout currently set at 3s). I will 
go looking down that route in case this is just a long network call and the 
timeout is "short".





was (Author: pskianis):
Somehow managed to recreate this in a different environment, but still cannot do 
it at will. I am attaching a filtered log from the "consumer" app that had the 
issue. 
Debugging was switched on for the consumer internals package of the kafka 
clients (o.a.k.c.consumer.internals). 

We have introduced a way to force the kafka consumer code to be "restarted" 
manually. At the moment, on a cluster of 4 kafkas/3 zookeepers and 3 client 
consumer apps, we have been trying to recreate this problem by causing 
rebalancing on purpose (group.id = salesforce-group-gb). You may notice other 
groups as well, since this app is using multiple groups, but those are OK at 
the moment AFAIK. 

The file related to this comment is attached as bad-server-with-more-logging-1 
(tarballed and gzipped). Let me know if you want me to provide it in another 
format.
There are some comments inside that file (normally starting with the word 
COMMENT). 

Note: the app is using akka-kafka-streams and I believe there is a timeout 
involved. You will be able to see this if you search for WakeupException.
I am wondering if that is causing the issue, because there is a delay in 
getting the information back from kafka (timeout currently set at 3s). I will 
go looking down that route in case this is just a long network call.




> One or more consumers in a consumer-group stop consuming after rebalancing
> --
>
> Key: KAFKA-5611
> URL: https://issues.apache.org/jira/browse/KAFKA-5611
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.0
>Reporter: Panos Skianis
> Attachments: bad-server-with-more-logging-1.tar.gz, kka02, Server 1, 
> Server 2, Server 3
>
>
> Scenario: 
>   - 3 zookeepers, 4 Kafkas on 0.10.2.0, with 0.9.0 compatibility still on 
> (other apps need it, but the app mentioned below is already on the kafka 
> 0.10.2.0 client).
>   - 3 servers running 1 consumer each under the same consumer groupId. 
>   - Servers seem to be consuming messages happily, but then there is a timeout 
> to an external service that causes our app to restart the Kafka Consumer on 
> one of the servers (this is by design). That causes rebalancing of the group, 
> and upon restart one of the Consumers seems to "block".
>   - Server 3 is where the problems occur.
>   - The problem fixes itself either by restarting one of the 3 servers or by 
> causing the group to rebalance again, using the console consumer with 
> autocommit set to false and the same group.
>  
> Note: 
>  - Haven't managed to recreate it at will yet.
>  - Mainly happens in the production environment, often enough. Hence I do not 
> have any logs with DEBUG/TRACE statements yet.
>  - Extracts from the log of each app server are attached, along with the log of 
> the kafka broker that seems to be dealing with the related group and 
> generations.
>  - See COMMENT lines in the files for further info.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (KAFKA-5611) One or more consumers in a consumer-group stop consuming after rebalancing

2017-07-21 Thread Panos Skianis (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16097048#comment-16097048
 ] 

Panos Skianis edited comment on KAFKA-5611 at 7/22/17 1:15 AM:
---

Somehow managed to recreate this in a different environment, but still cannot do 
it at will. I am attaching a filtered log from the "consumer" app that had the 
issue. 
Debugging was switched on for the consumer internals package of the kafka 
clients (o.a.k.c.consumer.internals). 

We have introduced a way to force the kafka consumer code to be "restarted" 
manually. At the moment, on a cluster of 4 kafkas/3 zookeepers and 3 client 
consumer apps, we have been trying to recreate this problem by causing 
rebalancing on purpose (group.id = salesforce-group-gb). You may notice other 
groups as well, since this app is using multiple groups, but those are OK at 
the moment AFAIK. 

The file related to this comment is attached as bad-server-with-more-logging-1 
(tarballed and gzipped). Let me know if you want me to provide it in another 
format.
There are some comments inside that file (normally starting with the word 
COMMENT). 

Note: the app is using akka-kafka-streams and I believe there is a timeout 
involved. You will be able to see this if you search for WakeupException.
I am wondering if that is causing the issue, because there is a delay in 
getting the information back from kafka (timeout currently set at 3s). I will 
go looking down that route in case this is just a long network call.





was (Author: pskianis):
Somehow managed to recreate this in a different environment, but still cannot do 
it at will. I am attaching a filtered log from the "consumer" app that had the 
issue. 
Debugging was switched on for the consumer internals package of the kafka 
clients (o.a.k.c.consumer.internals). 

We have introduced a way to force the kafka consumer code to be "restarted" 
manually. At the moment, on a cluster of 4 kafkas/3 zookeepers and 3 client 
consumer apps, we have been trying to recreate this problem by causing 
rebalancing on purpose (group.id = salesforce-group-gb). You may notice other 
groups as well, since this app is using multiple groups, but those are OK at 
the moment AFAIK. 

The file related to this comment is attached as bad-server-with-more-logging-1 
(tarballed and gzipped). Let me know if you want me to provide it in another 
format.
There are some comments inside that file (normally starting with the word 
COMMENT). 

Note: the app is using akka-kafka-streams and I believe there is a timeout 
involved. You will be able to see this if you search for WakeupException.
I am wondering if that is causing the issue, because there is a delay in 
getting the information back from kafka (timeout currently set at 3s). I will 
go looking down that route in case this is just a long network call.



> One or more consumers in a consumer-group stop consuming after rebalancing
> --
>
> Key: KAFKA-5611
> URL: https://issues.apache.org/jira/browse/KAFKA-5611
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.0
>Reporter: Panos Skianis
> Attachments: bad-server-with-more-logging-1.tar.gz, kka02, Server 1, 
> Server 2, Server 3
>
>
> Scenario: 
>   - 3 zookeepers, 4 Kafkas on 0.10.2.0, with 0.9.0 compatibility still on 
> (other apps need it, but the app mentioned below is already on the kafka 
> 0.10.2.0 client).
>   - 3 servers running 1 consumer each under the same consumer groupId. 
>   - Servers seem to be consuming messages happily, but then there is a timeout 
> to an external service that causes our app to restart the Kafka Consumer on 
> one of the servers (this is by design). That causes rebalancing of the group, 
> and upon restart one of the Consumers seems to "block".
>   - Server 3 is where the problems occur.
>   - The problem fixes itself either by restarting one of the 3 servers or by 
> causing the group to rebalance again, using the console consumer with 
> autocommit set to false and the same group.
>  
> Note: 
>  - Haven't managed to recreate it at will yet.
>  - Mainly happens in the production environment, often enough. Hence I do not 
> have any logs with DEBUG/TRACE statements yet.
>  - Extracts from the log of each app server are attached, along with the log of 
> the kafka broker that seems to be dealing with the related group and 
> generations.
>  - See COMMENT lines in the files for further info.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-5611) One or more consumers in a consumer-group stop consuming after rebalancing

2017-07-21 Thread Panos Skianis (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panos Skianis updated KAFKA-5611:
-
Attachment: bad-server-with-more-logging-1.tar.gz

> One or more consumers in a consumer-group stop consuming after rebalancing
> --
>
> Key: KAFKA-5611
> URL: https://issues.apache.org/jira/browse/KAFKA-5611
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.0
>Reporter: Panos Skianis
> Attachments: bad-server-with-more-logging-1.tar.gz, kka02, Server 1, 
> Server 2, Server 3
>
>
> Scenario: 
>   - 3 zookeepers, 4 Kafkas on 0.10.2.0, with 0.9.0 compatibility still on 
> (other apps need it, but the app mentioned below is already on the kafka 
> 0.10.2.0 client).
>   - 3 servers running 1 consumer each under the same consumer groupId. 
>   - Servers seem to be consuming messages happily, but then there is a timeout 
> to an external service that causes our app to restart the Kafka Consumer on 
> one of the servers (this is by design). That causes rebalancing of the group, 
> and upon restart one of the Consumers seems to "block".
>   - Server 3 is where the problems occur.
>   - The problem fixes itself either by restarting one of the 3 servers or by 
> causing the group to rebalance again, using the console consumer with 
> autocommit set to false and the same group.
>  
> Note: 
>  - Haven't managed to recreate it at will yet.
>  - Mainly happens in the production environment, often enough. Hence I do not 
> have any logs with DEBUG/TRACE statements yet.
>  - Extracts from the log of each app server are attached, along with the log of 
> the kafka broker that seems to be dealing with the related group and 
> generations.
>  - See COMMENT lines in the files for further info.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5481) ListOffsetResponse isn't logged in the right way with trace level enabled

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16097009#comment-16097009
 ] 

ASF GitHub Bot commented on KAFKA-5481:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3383


> ListOffsetResponse isn't logged in the right way with trace level enabled
> -
>
> Key: KAFKA-5481
> URL: https://issues.apache.org/jira/browse/KAFKA-5481
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Reporter: Paolo Patierno
>Assignee: Paolo Patierno
> Fix For: 0.11.1.0
>
>
> Hi,
> when trace level is enabled, the ListOffsetResponse isn't logged properly; 
> just the class name is shown in the log:
> {code}
> [2017-06-20 14:18:50,724] TRACE Received ListOffsetResponse 
> org.apache.kafka.common.requests.ListOffsetResponse@7ed5ecd9 from broker 
> new-host:9092 (id: 0 rack: null) 
> (org.apache.kafka.clients.consumer.internals.Fetcher:674)
> {code}
> The class doesn't provide a toString() implementation for this.
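For illustration, a minimal sketch of the kind of override that makes such log 
lines readable (the field names are illustrative, not the actual 
ListOffsetResponse fields):

{code}
// Without an override, Object#toString yields something like
// "org.apache.kafka.common.requests.ListOffsetResponse@7ed5ecd9".
@Override
public String toString() {
    return "ListOffsetResponse(throttleTimeMs=" + throttleTimeMs +
           ", responseData=" + responseData + ")";
}
{code}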



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (KAFKA-4559) Add a site search bar on the Web site

2017-07-21 Thread Ravi Raj Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095728#comment-16095728
 ] 

Ravi Raj Singh edited comment on KAFKA-4559 at 7/21/17 9:48 PM:


I had a quick overview of the repo; based on that, below is my approach.
Since most of the text is static, I was thinking we could generate metadata 
for all the pages, load it when a page renders, and provide client-side 
search. 
Please let me know if you have any other suggestions.

[~derrickor] Please let me know your views. I can work on implementing 
something server-side if you have something like that in mind.



was (Author: zer0id0l):
I had a quick overview of the repo; based on that, below is my approach.
Since most of the text is static, I was thinking we could generate metadata 
for all the pages, load it when a page renders, and provide client-side 
search. 
Please let me know if you have any other suggestions.

> Add a site search bar on the Web site
> -
>
> Key: KAFKA-4559
> URL: https://issues.apache.org/jira/browse/KAFKA-4559
> Project: Kafka
>  Issue Type: Bug
>  Components: website
>Reporter: Guozhang Wang
>Assignee: Ravi Raj Singh
>  Labels: newbie
>
> As titled: as we are breaking the "documentation" HTML into sub-spaces and 
> sub-pages, people cannot simply use `control + f` on that page, and a 
> site-scoped search bar would help in this case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (KAFKA-3978) Cannot truncate to a negative offset (-1) exception at broker startup

2017-07-21 Thread Dominic Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16096887#comment-16096887
 ] 

Dominic Evans edited comment on KAFKA-3978 at 7/21/17 9:21 PM:
---

[~ijuma] we actually hit this issue the other day on 0.10.2 and it seemed to 
have more serious repercussions than initially suggested here.

Because the truncateTo exception causes 
kafka.server.ReplicaManager.makeFollowers to exit, having removed any fetchers 
for the set of partitions passed in, we seem to end up with the broker never 
following/replicating those partitions again unless the "bad" partition 
(seemingly with an unknown checkpoint offset) is manually removed from ZK and 
the broker is restarted. 

I know that KIP-101 reworked this area, so it may no longer be possible from 
0.11 onwards. I also wasn't entirely sure how we got into that state and was 
unable to reproduce it by manually fiddling with replication-offset-checkpoint 
or otherwise.
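To make the failure mode concrete: the real logic lives in Scala in 
kafka.server.ReplicaManager, but the control flow described above amounts to 
something like this toy Java sketch (illustrative only, not Kafka code):

{code}
import java.util.List;

// A batch operation removes state for every partition up front, then one bad
// partition throws mid-loop, so the re-registration step is never reached.
static void makeFollowersSketch(List<String> partitions, long checkpointOffset) {
    System.out.println("removed fetchers for " + partitions); // side effect first
    for (String partition : partitions) {
        if (checkpointOffset < 0)  // the "unknown checkpoint offset" case
            throw new IllegalArgumentException(
                    "Cannot truncate to a negative offset (" + checkpointOffset + ")");
    }
    System.out.println("re-added fetchers for " + partitions); // never reached
}
{code}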


was (Author: dnwe):
[~ijuma] we actually hit this issue the other day on 0.10.2 and it seemed to 
have more serious repercussions than initially suggested here.

Because the truncateTo exception causes 
kafka.server.ReplicaManager.makeFollowers to exit, having removed any fetchers 
for the set of partitions passed in, we seem to end up with the broker never 
following/replicating those partitions again unless the "bad" partition 
(seemingly with an unknown checkpoint offset) is manually removed from ZK and 
the broker is restarted. 

> Cannot truncate to a negative offset (-1) exception at broker startup
> -
>
> Key: KAFKA-3978
> URL: https://issues.apache.org/jira/browse/KAFKA-3978
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
> Environment: 3.13.0-87-generic 
>Reporter: Juho Mäkinen
>Priority: Critical
>  Labels: reliability, startup
>
> During the broker startup sequence, the broker's server.log shows this 
> exception. The problem persists after multiple restarts and also on another 
> broker in the cluster.
> {code}
> INFO [Socket Server on Broker 1002], Started 1 acceptor threads 
> (kafka.network.SocketServer)
> INFO [Socket Server on Broker 1002], Started 1 acceptor threads 
> (kafka.network.SocketServer)
> INFO [ExpirationReaper-1002], Starting  
> (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
> INFO [ExpirationReaper-1002], Starting  
> (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
> INFO [ExpirationReaper-1002], Starting  
> (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
> INFO [ExpirationReaper-1002], Starting  
> (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
> INFO [ExpirationReaper-1002], Starting  
> (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
> INFO [ExpirationReaper-1002], Starting  
> (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
> INFO [ExpirationReaper-1002], Starting  
> (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
> INFO [ExpirationReaper-1002], Starting  
> (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
> INFO [GroupCoordinator 1002]: Starting up. 
> (kafka.coordinator.GroupCoordinator)
> INFO [GroupCoordinator 1002]: Starting up. 
> (kafka.coordinator.GroupCoordinator)
> INFO [GroupCoordinator 1002]: Startup complete. 
> (kafka.coordinator.GroupCoordinator)
> INFO [GroupCoordinator 1002]: Startup complete. 
> (kafka.coordinator.GroupCoordinator)
> INFO [Group Metadata Manager on Broker 1002]: Removed 0 expired offsets in 9 
> milliseconds. (kafka.coordinator.GroupMetadataManager)
> INFO [Group Metadata Manager on Broker 1002]: Removed 0 expired offsets in 9 
> milliseconds. (kafka.coordinator.GroupMetadataManager)
> INFO [ThrottledRequestReaper-Produce], Starting  
> (kafka.server.ClientQuotaManager$ThrottledRequestReaper)
> INFO [ThrottledRequestReaper-Produce], Starting  
> (kafka.server.ClientQuotaManager$ThrottledRequestReaper)
> INFO [ThrottledRequestReaper-Fetch], Starting  
> (kafka.server.ClientQuotaManager$ThrottledRequestReaper)
> INFO [ThrottledRequestReaper-Fetch], Starting  
> (kafka.server.ClientQuotaManager$ThrottledRequestReaper)
> INFO Will not load MX4J, mx4j-tools.jar is not in the classpath 
> (kafka.utils.Mx4jLoader$)
> INFO Will not load MX4J, mx4j-tools.jar is not in the classpath 
> (kafka.utils.Mx4jLoader$)
> INFO Creating /brokers/ids/1002 (is it secure? false) 
> (kafka.utils.ZKCheckedEphemeral)
> INFO Creating /brokers/ids/1002 (is it secure? false) 
> (kafka.utils.ZKCheckedEphemeral)
> INFO Result of znode creation is: OK (kafka.utils.ZKCheckedEphemeral)
> INFO Result of znode creation is: OK (kafka.utils.ZKCheckedEphemeral)
> INFO Registered broker 1002 at path /brokers/ids/1002 with addresses: 
> 

[jira] [Commented] (KAFKA-3978) Cannot truncate to a negative offset (-1) exception at broker startup

2017-07-21 Thread Dominic Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16096887#comment-16096887
 ] 

Dominic Evans commented on KAFKA-3978:
--

[~ijuma] we actually hit this issue the other day on 0.10.2 and it seemed to 
have more serious repercussions than initially suggested here.

Because the truncateTo exception causes 
kafka.server.ReplicaManager.makeFollowers to exit, having removed any fetchers 
for the set of partitions passed in, we seem to end up with the broker never 
following/replicating those partitions again unless the "bad" partition 
(seemingly with an unknown checkpoint offset) is manually removed from ZK and 
the broker is restarted. 

> Cannot truncate to a negative offset (-1) exception at broker startup
> -
>
> Key: KAFKA-3978
> URL: https://issues.apache.org/jira/browse/KAFKA-3978
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
> Environment: 3.13.0-87-generic 
>Reporter: Juho Mäkinen
>Priority: Critical
>  Labels: reliability, startup
>
> During the broker startup sequence, the broker's server.log shows this 
> exception. The problem persists after multiple restarts and also on another 
> broker in the cluster.
> {code}
> INFO [Socket Server on Broker 1002], Started 1 acceptor threads 
> (kafka.network.SocketServer)
> INFO [Socket Server on Broker 1002], Started 1 acceptor threads 
> (kafka.network.SocketServer)
> INFO [ExpirationReaper-1002], Starting  
> (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
> INFO [ExpirationReaper-1002], Starting  
> (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
> INFO [ExpirationReaper-1002], Starting  
> (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
> INFO [ExpirationReaper-1002], Starting  
> (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
> INFO [ExpirationReaper-1002], Starting  
> (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
> INFO [ExpirationReaper-1002], Starting  
> (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
> INFO [ExpirationReaper-1002], Starting  
> (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
> INFO [ExpirationReaper-1002], Starting  
> (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
> INFO [GroupCoordinator 1002]: Starting up. 
> (kafka.coordinator.GroupCoordinator)
> INFO [GroupCoordinator 1002]: Starting up. 
> (kafka.coordinator.GroupCoordinator)
> INFO [GroupCoordinator 1002]: Startup complete. 
> (kafka.coordinator.GroupCoordinator)
> INFO [GroupCoordinator 1002]: Startup complete. 
> (kafka.coordinator.GroupCoordinator)
> INFO [Group Metadata Manager on Broker 1002]: Removed 0 expired offsets in 9 
> milliseconds. (kafka.coordinator.GroupMetadataManager)
> INFO [Group Metadata Manager on Broker 1002]: Removed 0 expired offsets in 9 
> milliseconds. (kafka.coordinator.GroupMetadataManager)
> INFO [ThrottledRequestReaper-Produce], Starting  
> (kafka.server.ClientQuotaManager$ThrottledRequestReaper)
> INFO [ThrottledRequestReaper-Produce], Starting  
> (kafka.server.ClientQuotaManager$ThrottledRequestReaper)
> INFO [ThrottledRequestReaper-Fetch], Starting  
> (kafka.server.ClientQuotaManager$ThrottledRequestReaper)
> INFO [ThrottledRequestReaper-Fetch], Starting  
> (kafka.server.ClientQuotaManager$ThrottledRequestReaper)
> INFO Will not load MX4J, mx4j-tools.jar is not in the classpath 
> (kafka.utils.Mx4jLoader$)
> INFO Will not load MX4J, mx4j-tools.jar is not in the classpath 
> (kafka.utils.Mx4jLoader$)
> INFO Creating /brokers/ids/1002 (is it secure? false) 
> (kafka.utils.ZKCheckedEphemeral)
> INFO Creating /brokers/ids/1002 (is it secure? false) 
> (kafka.utils.ZKCheckedEphemeral)
> INFO Result of znode creation is: OK (kafka.utils.ZKCheckedEphemeral)
> INFO Result of znode creation is: OK (kafka.utils.ZKCheckedEphemeral)
> INFO Registered broker 1002 at path /brokers/ids/1002 with addresses: 
> PLAINTEXT -> EndPoint(172.16.2.22,9092,PLAINTEXT) (kafka.utils.ZkUtils)
> INFO Registered broker 1002 at path /brokers/ids/1002 with addresses: 
> PLAINTEXT -> EndPoint(172.16.2.22,9092,PLAINTEXT) (kafka.utils.ZkUtils)
> INFO Kafka version : 0.10.0.0 (org.apache.kafka.common.utils.AppInfoParser)
> INFO Kafka commitId : b8642491e78c5a13 
> (org.apache.kafka.common.utils.AppInfoParser)
> INFO [Kafka Server 1002], started (kafka.server.KafkaServer)
> INFO [Kafka Server 1002], started (kafka.server.KafkaServer)
> Error when handling request 
> {controller_id=1004,controller_epoch=1,partition_states=[..REALLY LONG OUTPUT 
> SNIPPED AWAY..], 
> live_leaders=[{id=1004,host=172.16.6.187,port=9092},{id=1003,host=172.16.2.21,port=9092}]}
>  (kafka.server.KafkaApis)
> ERROR java.lang.IllegalArgumentException: Cannot truncate to a negative 
> 

[jira] [Updated] (KAFKA-5625) Invalid subscription data may cause streams app to throw BufferUnderflowException

2017-07-21 Thread Xavier Léauté (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xavier Léauté updated KAFKA-5625:
-
Affects Version/s: 0.11.0.0
  Component/s: streams

> Invalid subscription data may cause streams app to throw 
> BufferUnderflowException
> -
>
> Key: KAFKA-5625
> URL: https://issues.apache.org/jira/browse/KAFKA-5625
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0
>Reporter: Xavier Léauté
>Priority: Minor
>
> I was able to cause my streams app to crash with the following error when a 
> rogue client attempted to join the same consumer group.
> At the very least I would expect streams to throw a 
> {{TaskAssignmentException}} to indicate invalid subscription data.
> {code}
> java.nio.BufferUnderflowException
> at java.nio.Buffer.nextGetIndex(Buffer.java:506)
> at java.nio.HeapByteBuffer.getInt(HeapByteBuffer.java:361)
> at 
> org.apache.kafka.streams.processor.internals.assignment.SubscriptionInfo.decode(SubscriptionInfo.java:97)
> at 
> org.apache.kafka.streams.processor.internals.StreamPartitionAssignor.assign(StreamPartitionAssignor.java:302)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.performAssignment(ConsumerCoordinator.java:365)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onJoinLeader(AbstractCoordinator.java:512)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.access$1100(AbstractCoordinator.java:93)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$JoinGroupResponseHandler.handle(AbstractCoordinator.java:462)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$JoinGroupResponseHandler.handle(AbstractCoordinator.java:445)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:798)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:778)
> at 
> org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:204)
> at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:167)
> at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:127)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:488)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:348)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:262)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:208)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:168)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:358)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:310)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:297)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1078)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1043)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:574)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:545)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:519)
> {code}
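A sketch of the kind of defensive check being requested, assuming the decoder 
reads a version int from the head of the buffer (the helper below is 
hypothetical, not the actual SubscriptionInfo.decode code):

{code}
import java.nio.ByteBuffer;
import org.apache.kafka.streams.errors.TaskAssignmentException;

// Hypothetical defensive decode: validate the buffer before reading so that
// malformed subscription data surfaces as a TaskAssignmentException rather
// than an uncaught BufferUnderflowException during partition assignment.
static int decodeVersion(ByteBuffer data) {
    if (data == null || data.remaining() < Integer.BYTES)
        throw new TaskAssignmentException("Invalid subscription data: buffer too short");
    return data.getInt();
}
{code}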



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5625) Invalid subscription data may cause streams app to throw BufferUnderflowException

2017-07-21 Thread Xavier Léauté (JIRA)
Xavier Léauté created KAFKA-5625:


 Summary: Invalid subscription data may cause streams app to throw 
BufferUnderflowException
 Key: KAFKA-5625
 URL: https://issues.apache.org/jira/browse/KAFKA-5625
 Project: Kafka
  Issue Type: Bug
Reporter: Xavier Léauté
Priority: Minor


I was able to cause my streams app to crash with the following error when a 
rogue client attempted to join the same consumer group.

At the very least I would expect streams to throw a {{TaskAssignmentException}} 
to indicate invalid subscription data.

{code}
java.nio.BufferUnderflowException
at java.nio.Buffer.nextGetIndex(Buffer.java:506)
at java.nio.HeapByteBuffer.getInt(HeapByteBuffer.java:361)
at 
org.apache.kafka.streams.processor.internals.assignment.SubscriptionInfo.decode(SubscriptionInfo.java:97)
at 
org.apache.kafka.streams.processor.internals.StreamPartitionAssignor.assign(StreamPartitionAssignor.java:302)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.performAssignment(ConsumerCoordinator.java:365)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onJoinLeader(AbstractCoordinator.java:512)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.access$1100(AbstractCoordinator.java:93)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator$JoinGroupResponseHandler.handle(AbstractCoordinator.java:462)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator$JoinGroupResponseHandler.handle(AbstractCoordinator.java:445)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:798)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:778)
at 
org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:204)
at 
org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:167)
at 
org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:127)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:488)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:348)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:262)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:208)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:168)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:358)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:310)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:297)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1078)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1043)
at 
org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:574)
at 
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:545)
at 
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:519)
{code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-3992) InstanceAlreadyExistsException Error for Consumers Starting in Parallel

2017-07-21 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16096662#comment-16096662
 ] 

Guozhang Wang commented on KAFKA-3992:
--

[~cmolter] There are some discussions about whether we should recommend using 
the same client-ids for different clients in 
https://github.com/apache/kafka/pull/3328. 

In general, `client-ids` are used as distinguishing identifiers for client 
metrics (though in customizable reporters, users could choose to ignore the 
client-id to aggregate all clients into a single metric; in the default 
{{JmxReporter}} impl we choose to have one metric per client) and also for 
broker-side request logging, so users are advised to use different values for 
different clients.

As for quotas, as mentioned in the PR, we have been suggesting that users apply 
the "user id" as the principal if they want to have multiple clients sharing 
the same quota.
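As a concrete (illustrative) example of that advice, giving each consumer in a 
process its own client.id keeps metrics, and the JMX mbeans derived from them, 
from colliding (the naming scheme below is hypothetical):

{code}
import java.util.Properties;
import java.util.concurrent.atomic.AtomicInteger;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Give each consumer a distinct client.id so metric names never collide,
// even when consumers are started in parallel.
static final AtomicInteger CLIENT_SEQ = new AtomicInteger();

static KafkaConsumer<String, String> newConsumer(Properties base) {
    Properties props = new Properties();
    props.putAll(base);
    props.put(ConsumerConfig.CLIENT_ID_CONFIG,
              "myapp-consumer-" + CLIENT_SEQ.getAndIncrement());
    return new KafkaConsumer<>(props);
}
{code}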

> InstanceAlreadyExistsException Error for Consumers Starting in Parallel
> ---
>
> Key: KAFKA-3992
> URL: https://issues.apache.org/jira/browse/KAFKA-3992
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.0, 0.10.0.0
>Reporter: Alexander Cook
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.11.1.0
>
>
> I see the following error sometimes when I start multiple consumers at about 
> the same time in the same process (separate threads). Everything seems to 
> work fine afterwards, so should this not actually be an ERROR level message, 
> or could there be something going wrong that I don't see? 
> Let me know if I can provide any more info! 
> Error processing messages: Error registering mbean 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
> org.apache.kafka.common.KafkaException: Error registering mbean 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
>  
> Caused by: javax.management.InstanceAlreadyExistsException: 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
> Here is the full stack trace: 
> M[?:com.ibm.streamsx.messaging.kafka.KafkaConsumerV9.produceTuples:-1]  - 
> Error processing messages: Error registering mbean 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
> org.apache.kafka.common.KafkaException: Error registering mbean 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
>   at 
> org.apache.kafka.common.metrics.JmxReporter.reregister(JmxReporter.java:159)
>   at 
> org.apache.kafka.common.metrics.JmxReporter.metricChange(JmxReporter.java:77)
>   at 
> org.apache.kafka.common.metrics.Metrics.registerMetric(Metrics.java:288)
>   at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:177)
>   at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:162)
>   at 
> org.apache.kafka.common.network.Selector$SelectorMetrics.maybeRegisterConnectionMetrics(Selector.java:641)
>   at org.apache.kafka.common.network.Selector.poll(Selector.java:268)
>   at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:270)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:303)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:197)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:187)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:126)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorKnown(AbstractCoordinator.java:186)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:857)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:829)
>   at 
> com.ibm.streamsx.messaging.kafka.KafkaConsumerV9.produceTuples(KafkaConsumerV9.java:129)
>   at 
> com.ibm.streamsx.messaging.kafka.KafkaConsumerV9$1.run(KafkaConsumerV9.java:70)
>   at java.lang.Thread.run(Thread.java:785)
>   at 
> com.ibm.streams.operator.internal.runtime.OperatorThreadFactory$2.run(OperatorThreadFactory.java:137)
> Caused by: javax.management.InstanceAlreadyExistsException: 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
>   at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:449)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1910)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:978)
>   at 
> 

[jira] [Updated] (KAFKA-5615) Consumer hang in poll method while rebalancing is in progress (Duplicate)

2017-07-21 Thread Geoffrey Stewart (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Geoffrey Stewart updated KAFKA-5615:

Reviewer: Vahid Hashemian

> Consumer hang in poll method while rebalancing is in progress (Duplicate)
> -
>
> Key: KAFKA-5615
> URL: https://issues.apache.org/jira/browse/KAFKA-5615
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.2.0
>Reporter: Geoffrey Stewart
>
> Opening this issue to bring some attention to another JIRA which may need to 
> be re-opened:
> https://issues.apache.org/jira/browse/KAFKA-5016
> It was resolved as Not a Bug, but it may need to be re-opened for at least a 
> documentation bug.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-5624) Unsafe use of expired sensors

2017-07-21 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson updated KAFKA-5624:
---
Description: 
There seem to be a couple of unhandled cases following sensor expiration:

1. Static sensors (such as {{ClientQuotaManager.delayQueueSensor}}) can be 
expired due to inactivity, but the references will remain valid and usable. 
It is probably a good idea to either ensure we use a "get or create" pattern 
when accessing the sensor, or to add a new static registration option which 
makes the sensor ineligible for expiration.
2. It is possible to register metrics through the sensor even after it is 
expired. We should probably raise an exception instead.

  was:
There seem to be a couple of unhandled cases following sensor expiration:

1. Static sensors (such as {{ClientQuotaManager.}}) can be expired due to 
inactivity, but the references will remain valid and usable. It is probably a 
good idea to either ensure we use a "get or create" pattern when accessing the 
sensor, or to add a new static registration option which makes the sensor 
ineligible for expiration.
2. It is possible to register metrics through the sensor even after it is 
expired. We should probably raise an exception instead.


> Unsafe use of expired sensors
> -
>
> Key: KAFKA-5624
> URL: https://issues.apache.org/jira/browse/KAFKA-5624
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>
> There seem to be a couple of unhandled cases following sensor expiration:
> 1. Static sensors (such as {{ClientQuotaManager.delayQueueSensor}}) can be 
> expired due to inactivity, but the references will remain valid and usable. 
> It is probably a good idea to either ensure we use a "get or create" pattern 
> when accessing the sensor (see the sketch below), or to add a new static 
> registration option which makes the sensor ineligible for expiration.
> 2. It is possible to register metrics through the sensor even after it is 
> expired. We should probably raise an exception instead.
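For reference, {{Metrics#sensor(String)}} already has get-or-create semantics, 
so option 1 could be as simple as always going through it instead of holding a 
static {{Sensor}} field (the sensor name below is illustrative):

{code}
import org.apache.kafka.common.metrics.Metrics;
import org.apache.kafka.common.metrics.Sensor;

// "Get or create" access: Metrics#sensor(String) returns the existing sensor
// with this name, or creates a fresh one if the old one was expired and
// removed, so callers never hold a stale reference.
static Sensor delayQueueSensor(Metrics metrics) {
    return metrics.sensor("delayQueue");  // sensor name is illustrative
}
{code}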



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5624) Unsafe use of expired sensors

2017-07-21 Thread Jason Gustafson (JIRA)
Jason Gustafson created KAFKA-5624:
--

 Summary: Unsafe use of expired sensors
 Key: KAFKA-5624
 URL: https://issues.apache.org/jira/browse/KAFKA-5624
 Project: Kafka
  Issue Type: Bug
Reporter: Jason Gustafson


There seem to be a couple of unhandled cases following sensor expiration:

1. Static sensors (such as {{ClientQuotaManager.}}) can be expired due to 
inactivity, but the references will remain valid and usable. It is probably a 
good idea to either ensure we use a "get or create" pattern when accessing the 
sensor, or to add a new static registration option which makes the sensor 
ineligible for expiration.
2. It is possible to register metrics through the sensor even after it is 
expired. We should probably raise an exception instead.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5512) KafkaConsumer: High memory allocation rate when idle

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16096406#comment-16096406
 ] 

ASF GitHub Bot commented on KAFKA-5512:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3442


> KafkaConsumer: High memory allocation rate when idle
> 
>
> Key: KAFKA-5512
> URL: https://issues.apache.org/jira/browse/KAFKA-5512
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.1.1
>Reporter: Stephane Roset
>  Labels: performance
> Fix For: 0.11.0.1, 0.11.1.0
>
>
> Hi,
> We noticed in our application that the memory allocation rate increased 
> significantly when we have no Kafka messages to consume. We isolated the 
> issue by using a JVM that simply runs 128 Kafka consumers. These consumers 
> consume 128 partitions (so each consumer consumes one partition). The 
> partitions are empty and no message has been sent during the test. The 
> consumers were configured with default values (session.timeout.ms=3, 
> fetch.max.wait.ms=500, receive.buffer.bytes=65536, 
> heartbeat.interval.ms=3000, max.poll.interval.ms=30, 
> max.poll.records=500). The Kafka cluster was made of 3 brokers. Within this 
> context, the allocation rate was about 55 MiB/s. This high allocation rate 
> generates a lot of GC activity (to garbage-collect the young heap) and was an 
> issue for our project.
> We profiled the JVM with JProfiler. We noticed that there was a huge 
> quantity of ArrayList$Itr instances in memory. These collections were mainly 
> instantiated by the methods handleCompletedReceives, handleCompletedSends, 
> handleConnections and handleDisconnections of the class NetworkClient. We also 
> noticed that we had a lot of calls to the method pollOnce of the class 
> KafkaConsumer. 
> So we decided to run only one consumer and to profile the calls to the method 
> pollOnce. We noticed that regularly a huge number of calls is made to this 
> method, up to 268000 calls within 100ms. The pollOnce method calls the 
> NetworkClient.handle* methods. These methods iterate on collections (even if 
> they are empty), so that explains the huge number of iterators in memory.
> The large number of calls is related to the heartbeat mechanism. The pollOnce 
> method calculates the poll timeout; if a heartbeat needs to be done, the 
> timeout will be set to 0. The problem is that the heartbeat thread checks 
> every 100 ms (default value of retry.backoff.ms) if a heartbeat should be 
> sent, so the KafkaConsumer will call the poll method in a loop without 
> timeout until the heartbeat thread awakes. For example: the heartbeat thread 
> just started to wait and will awake in 99ms. So during 99ms, the 
> KafkaConsumer will call in a loop the pollOnce method and will use a timeout 
> of 0. That explains how we can have 268000 calls within 100ms. 
> The heartbeat thread calls the method AbstractCoordinator.wait() to sleep, so 
> I think the Kafka consumer should wake the heartbeat thread with a notify() 
> when needed.
> We made two quick fixes to solve this issue:
>   - In NetworkClient.handle*(), we don't iterate on collections if they are 
> empty (to avoid unnecessary iterator instantiations; a sketch of this follows 
> below).
>   - In KafkaConsumer.pollOnce(), if the poll timeout is equal to 0, we notify 
> the heartbeat thread to wake it (a dirty fix, because we don't handle the 
> autocommit case).
> With these 2 quick fixes and 128 consumers, the allocation rate drops down 
> from 55 MiB/s to 4 MiB/s.
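A minimal sketch of what quick fix #1 amounts to inside the NetworkClient 
handle* methods (a simplified fragment, not the actual patch):

{code}
// Skip the loop entirely when the selector produced nothing, so no ArrayList
// iterator is allocated on each of the very frequent pollOnce() calls made
// while the consumer is idle.
List<NetworkReceive> receives = this.selector.completedReceives();
if (!receives.isEmpty()) {
    for (NetworkReceive receive : receives) {
        // ... existing response handling ...
    }
}
{code}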



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-1421) Error: Could not find or load main class kafka.perf.SimpleConsumerPerformance

2017-07-21 Thread Daryl Erwin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryl Erwin resolved KAFKA-1421.

Resolution: Done

> Error: Could not find or load main class kafka.perf.SimpleConsumerPerformance
> -
>
> Key: KAFKA-1421
> URL: https://issues.apache.org/jira/browse/KAFKA-1421
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.0
>Reporter: Daryl Erwin
>Assignee: Jun Rao
>Priority: Minor
>  Labels: performance
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Did a base install with 
> sbt update
> sbt package
> I am able to successfully run the console producer and consumer. I am trying 
> to run the perf scripts (./kafka-simple-consumer-perf-test.sh), but it appears 
> the jar file is not generated. 
> What are the steps I need to run to create this jar file?
> .. same as:
> Error: Could not find or load main class kafka.perf.ProducerPerformance



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-3992) InstanceAlreadyExistsException Error for Consumers Starting in Parallel

2017-07-21 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16096330#comment-16096330
 ] 

Ismael Juma commented on KAFKA-3992:


https://github.com/apache/kafka/pull/3328 is related, but it seems to go in the 
other direction. cc [~guozhang] [~mjsax]

> InstanceAlreadyExistsException Error for Consumers Starting in Parallel
> ---
>
> Key: KAFKA-3992
> URL: https://issues.apache.org/jira/browse/KAFKA-3992
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.0, 0.10.0.0
>Reporter: Alexander Cook
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.11.1.0
>
>
> I see the following error sometimes when I start multiple consumers at about 
> the same time in the same process (separate threads). Everything seems to 
> work fine afterwards, so should this not actually be an ERROR level message, 
> or could there be something going wrong that I don't see? 
> Let me know if I can provide any more info! 
> Error processing messages: Error registering mbean 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
> org.apache.kafka.common.KafkaException: Error registering mbean 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
>  
> Caused by: javax.management.InstanceAlreadyExistsException: 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
> Here is the full stack trace: 
> M[?:com.ibm.streamsx.messaging.kafka.KafkaConsumerV9.produceTuples:-1]  - 
> Error processing messages: Error registering mbean 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
> org.apache.kafka.common.KafkaException: Error registering mbean 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
>   at 
> org.apache.kafka.common.metrics.JmxReporter.reregister(JmxReporter.java:159)
>   at 
> org.apache.kafka.common.metrics.JmxReporter.metricChange(JmxReporter.java:77)
>   at 
> org.apache.kafka.common.metrics.Metrics.registerMetric(Metrics.java:288)
>   at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:177)
>   at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:162)
>   at 
> org.apache.kafka.common.network.Selector$SelectorMetrics.maybeRegisterConnectionMetrics(Selector.java:641)
>   at org.apache.kafka.common.network.Selector.poll(Selector.java:268)
>   at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:270)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:303)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:197)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:187)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:126)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorKnown(AbstractCoordinator.java:186)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:857)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:829)
>   at 
> com.ibm.streamsx.messaging.kafka.KafkaConsumerV9.produceTuples(KafkaConsumerV9.java:129)
>   at 
> com.ibm.streamsx.messaging.kafka.KafkaConsumerV9$1.run(KafkaConsumerV9.java:70)
>   at java.lang.Thread.run(Thread.java:785)
>   at 
> com.ibm.streams.operator.internal.runtime.OperatorThreadFactory$2.run(OperatorThreadFactory.java:137)
> Caused by: javax.management.InstanceAlreadyExistsException: 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
>   at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:449)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1910)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:978)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:912)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:336)
>   at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:534)
>   at 
> org.apache.kafka.common.metrics.JmxReporter.reregister(JmxReporter.java:157)
>   ... 18 more



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-3992) InstanceAlreadyExistsException Error for Consumers Starting in Parallel

2017-07-21 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3992:
---
Fix Version/s: 0.11.1.0

> InstanceAlreadyExistsException Error for Consumers Starting in Parallel
> ---
>
> Key: KAFKA-3992
> URL: https://issues.apache.org/jira/browse/KAFKA-3992
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.0, 0.10.0.0
>Reporter: Alexander Cook
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.11.1.0
>
>
> I see the following error sometimes when I start multiple consumers at about 
> the same time in the same process (separate threads). Everything seems to 
> work fine afterwards, so should this not actually be an ERROR level message, 
> or could there be something going wrong that I don't see? 
> Let me know if I can provide any more info! 
> Error processing messages: Error registering mbean 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
> org.apache.kafka.common.KafkaException: Error registering mbean 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
>  
> Caused by: javax.management.InstanceAlreadyExistsException: 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
> Here is the full stack trace: 
> M[?:com.ibm.streamsx.messaging.kafka.KafkaConsumerV9.produceTuples:-1]  - 
> Error processing messages: Error registering mbean 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
> org.apache.kafka.common.KafkaException: Error registering mbean 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
>   at 
> org.apache.kafka.common.metrics.JmxReporter.reregister(JmxReporter.java:159)
>   at 
> org.apache.kafka.common.metrics.JmxReporter.metricChange(JmxReporter.java:77)
>   at 
> org.apache.kafka.common.metrics.Metrics.registerMetric(Metrics.java:288)
>   at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:177)
>   at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:162)
>   at 
> org.apache.kafka.common.network.Selector$SelectorMetrics.maybeRegisterConnectionMetrics(Selector.java:641)
>   at org.apache.kafka.common.network.Selector.poll(Selector.java:268)
>   at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:270)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:303)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:197)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:187)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:126)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorKnown(AbstractCoordinator.java:186)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:857)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:829)
>   at 
> com.ibm.streamsx.messaging.kafka.KafkaConsumerV9.produceTuples(KafkaConsumerV9.java:129)
>   at 
> com.ibm.streamsx.messaging.kafka.KafkaConsumerV9$1.run(KafkaConsumerV9.java:70)
>   at java.lang.Thread.run(Thread.java:785)
>   at 
> com.ibm.streams.operator.internal.runtime.OperatorThreadFactory$2.run(OperatorThreadFactory.java:137)
> Caused by: javax.management.InstanceAlreadyExistsException: 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
>   at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:449)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1910)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:978)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:912)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:336)
>   at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:534)
>   at 
> org.apache.kafka.common.metrics.JmxReporter.reregister(JmxReporter.java:157)
>   ... 18 more
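
A workaround often suggested while this is open is to give each consumer in the 
process an explicit, unique {{client.id}}, since the colliding mbean name embeds 
the client id. A minimal sketch; the broker address, group id, and the per-thread 
index are placeholders:

{code:java}
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class UniqueClientIds {
    // threadIndex is a hypothetical per-thread counter owned by the caller.
    static KafkaConsumer<String, String> newConsumer(int threadIndex) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");
        // A distinct client.id per consumer avoids the JMX mbean name
        // collision reported above.
        props.put(ConsumerConfig.CLIENT_ID_CONFIG, "my-app-consumer-" + threadIndex);
        return new KafkaConsumer<>(props,
                new StringDeserializer(), new StringDeserializer());
    }
}
{code}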



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (KAFKA-3992) InstanceAlreadyExistsException Error for Consumers Starting in Parallel

2017-07-21 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096309#comment-16096309
 ] 

Helena Edelson edited comment on KAFKA-3992 at 7/21/17 2:34 PM:


Hi, I see this in 0.10.2.1 and I believe I saw it in 0.11, confirming...

But JMX should always be optional.


was (Author: helena_e):
Hi, I see this in 0.10.2.1 and I believe I saw it in 0.11, confirming...

> InstanceAlreadyExistsException Error for Consumers Starting in Parallel
> ---
>
> Key: KAFKA-3992
> URL: https://issues.apache.org/jira/browse/KAFKA-3992
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.0, 0.10.0.0
>Reporter: Alexander Cook
>Assignee: Ewen Cheslack-Postava
>
> I see the following error sometimes when I start multiple consumers at about 
> the same time in the same process (separate threads). Everything seems to 
> work fine afterwards, so should this not actually be an ERROR level message, 
> or could there be something going wrong that I don't see? 
> Let me know if I can provide any more info! 
> Error processing messages: Error registering mbean 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
> org.apache.kafka.common.KafkaException: Error registering mbean 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
>  
> Caused by: javax.management.InstanceAlreadyExistsException: 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
> Here is the full stack trace: 
> M[?:com.ibm.streamsx.messaging.kafka.KafkaConsumerV9.produceTuples:-1]  - 
> Error processing messages: Error registering mbean 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
> org.apache.kafka.common.KafkaException: Error registering mbean 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
>   at 
> org.apache.kafka.common.metrics.JmxReporter.reregister(JmxReporter.java:159)
>   at 
> org.apache.kafka.common.metrics.JmxReporter.metricChange(JmxReporter.java:77)
>   at 
> org.apache.kafka.common.metrics.Metrics.registerMetric(Metrics.java:288)
>   at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:177)
>   at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:162)
>   at 
> org.apache.kafka.common.network.Selector$SelectorMetrics.maybeRegisterConnectionMetrics(Selector.java:641)
>   at org.apache.kafka.common.network.Selector.poll(Selector.java:268)
>   at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:270)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:303)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:197)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:187)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:126)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorKnown(AbstractCoordinator.java:186)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:857)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:829)
>   at 
> com.ibm.streamsx.messaging.kafka.KafkaConsumerV9.produceTuples(KafkaConsumerV9.java:129)
>   at 
> com.ibm.streamsx.messaging.kafka.KafkaConsumerV9$1.run(KafkaConsumerV9.java:70)
>   at java.lang.Thread.run(Thread.java:785)
>   at 
> com.ibm.streams.operator.internal.runtime.OperatorThreadFactory$2.run(OperatorThreadFactory.java:137)
> Caused by: javax.management.InstanceAlreadyExistsException: 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
>   at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:449)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1910)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:978)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:912)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:336)
>   at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:534)
>   at 
> org.apache.kafka.common.metrics.JmxReporter.reregister(JmxReporter.java:157)
>   ... 18 more



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5611) One or more consumers in a consumer-group stop consuming after rebalancing

2017-07-21 Thread Panos Skianis (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096311#comment-16096311
 ] 

Panos Skianis commented on KAFKA-5611:
--

[~huxi_2b] Yep, that does seem to be the case (and it is usually reflected 
in the assigned-partitions metric). I say usually because that has not always 
been the case: the log always shows what you have seen so far, but the 
metric is not always set to 0 when this issue happens (though that could be 
down to how the metrics are collected).

> One or more consumers in a consumer-group stop consuming after rebalancing
> --
>
> Key: KAFKA-5611
> URL: https://issues.apache.org/jira/browse/KAFKA-5611
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.0
>Reporter: Panos Skianis
> Attachments: kka02, Server 1, Server 2, Server 3
>
>
> Scenario: 
>   - 3 zookeepers, 4 Kafkas. 0.10.2.0, with 0.9.0 compatibility still on 
> (other apps need it but the one mentioned below is already on kafka 0.10.2.0  
> client).
>   - 3 servers running 1 consumer each under the same consumer groupId. 
>   - The servers seem to be consuming messages happily, but then a timeout 
> to an external service causes our app to restart the Kafka Consumer on 
> one of the servers (this is by design). That causes a rebalance of the 
> group, and upon restart one of the Consumers seems to "block".
>   - Server 3 is where the problems occur.
>   - The problem fixes itself either by restarting one of the 3 servers, or by 
> causing the group to rebalance again using the console consumer with 
> autocommit set to false and the same group.
>  
> Note: 
>  - Haven't managed to recreate it at will yet.
>  - It mainly happens in the production environment, often enough. Hence I do 
> not have any logs with DEBUG/TRACE statements yet.
>  - Extracts from the log of each app server are attached, along with the log 
> of the kafka broker that seems to be dealing with the related group and 
> generations.
>  - See COMMENT lines in the files for further info.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-3992) InstanceAlreadyExistsException Error for Consumers Starting in Parallel

2017-07-21 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096309#comment-16096309
 ] 

Helena Edelson commented on KAFKA-3992:
---

Hi, I see this in 0.10.2.1 and I believe I saw it in 0.11, confirming...

> InstanceAlreadyExistsException Error for Consumers Starting in Parallel
> ---
>
> Key: KAFKA-3992
> URL: https://issues.apache.org/jira/browse/KAFKA-3992
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.0, 0.10.0.0
>Reporter: Alexander Cook
>Assignee: Ewen Cheslack-Postava
>
> I see the following error sometimes when I start multiple consumers at about 
> the same time in the same process (separate threads). Everything seems to 
> work fine afterwards, so should this not actually be an ERROR level message, 
> or could there be something going wrong that I don't see? 
> Let me know if I can provide any more info! 
> Error processing messages: Error registering mbean 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
> org.apache.kafka.common.KafkaException: Error registering mbean 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
>  
> Caused by: javax.management.InstanceAlreadyExistsException: 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
> Here is the full stack trace: 
> M[?:com.ibm.streamsx.messaging.kafka.KafkaConsumerV9.produceTuples:-1]  - 
> Error processing messages: Error registering mbean 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
> org.apache.kafka.common.KafkaException: Error registering mbean 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
>   at 
> org.apache.kafka.common.metrics.JmxReporter.reregister(JmxReporter.java:159)
>   at 
> org.apache.kafka.common.metrics.JmxReporter.metricChange(JmxReporter.java:77)
>   at 
> org.apache.kafka.common.metrics.Metrics.registerMetric(Metrics.java:288)
>   at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:177)
>   at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:162)
>   at 
> org.apache.kafka.common.network.Selector$SelectorMetrics.maybeRegisterConnectionMetrics(Selector.java:641)
>   at org.apache.kafka.common.network.Selector.poll(Selector.java:268)
>   at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:270)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:303)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:197)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:187)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:126)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorKnown(AbstractCoordinator.java:186)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:857)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:829)
>   at 
> com.ibm.streamsx.messaging.kafka.KafkaConsumerV9.produceTuples(KafkaConsumerV9.java:129)
>   at 
> com.ibm.streamsx.messaging.kafka.KafkaConsumerV9$1.run(KafkaConsumerV9.java:70)
>   at java.lang.Thread.run(Thread.java:785)
>   at 
> com.ibm.streams.operator.internal.runtime.OperatorThreadFactory$2.run(OperatorThreadFactory.java:137)
> Caused by: javax.management.InstanceAlreadyExistsException: 
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
>   at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:449)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1910)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:978)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:912)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:336)
>   at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:534)
>   at 
> org.apache.kafka.common.metrics.JmxReporter.reregister(JmxReporter.java:157)
>   ... 18 more



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5608) System test failure due to timeout starting Jmx tool

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096194#comment-16096194
 ] 

ASF GitHub Bot commented on KAFKA-5608:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3553


> System test failure due to timeout starting Jmx tool
> 
>
> Key: KAFKA-5608
> URL: https://issues.apache.org/jira/browse/KAFKA-5608
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.11.0.1, 0.11.1.0
>
>
> Began seeing this in some failing system tests:
> {code}
> [INFO  - 2017-07-18 14:25:55,375 - background_thread - _protected_worker - 
> lineno:39]: Traceback (most recent call last):
>   File 
> "/home/jenkins/workspace/system-test-kafka-0.11.0/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/services/background_thread.py",
>  line 35, in _protected_worker
> self._worker(idx, node)
>   File 
> "/home/jenkins/workspace/system-test-kafka-0.11.0/kafka/tests/kafkatest/services/console_consumer.py",
>  line 261, in _worker
> self.start_jmx_tool(idx, node)
>   File 
> "/home/jenkins/workspace/system-test-kafka-0.11.0/kafka/tests/kafkatest/services/monitor/jmx.py",
>  line 73, in start_jmx_tool
> wait_until(lambda: self._jmx_has_output(node), timeout_sec=10, 
> backoff_sec=.5, err_msg="%s: Jmx tool took too long to start" % node.account)
>   File 
> "/home/jenkins/workspace/system-test-kafka-0.11.0/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/utils/util.py",
>  line 36, in wait_until
> raise TimeoutError(err_msg)
> TimeoutError: ubuntu@worker7: Jmx tool took too long to start
> {code}
> This is immediately followed by a consumer timeout in the failing cases:
> {code}
> [INFO  - 2017-07-18 14:26:46,907 - runner_client - log - lineno:221]: 
> RunnerClient: 
> kafkatest.tests.core.security_rolling_upgrade_test.TestSecurityRollingUpgrade.test_rolling_upgrade_phase_two.broker_protocol=SASL_SSL.client_protocol=SASL_SSL:
>  FAIL: Consumer failed to consume messages for 60s.
> Traceback (most recent call last):
>   File 
> "/home/jenkins/workspace/system-test-kafka-0.11.0/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
>  line 123, in run
> data = self.run_test()
>   File 
> "/home/jenkins/workspace/system-test-kafka-0.11.0/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
>  line 176, in run_test
> return self.test_context.function(self.test)
>   File 
> "/home/jenkins/workspace/system-test-kafka-0.11.0/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/mark/_mark.py",
>  line 321, in wrapper
> return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File 
> "/home/jenkins/workspace/system-test-kafka-0.11.0/kafka/tests/kafkatest/tests/core/security_rolling_upgrade_test.py",
>  line 148, in test_rolling_upgrade_phase_two
> self.run_produce_consume_validate(self.roll_in_secured_settings, 
> client_protocol, broker_protocol)
>   File 
> "/home/jenkins/workspace/system-test-kafka-0.11.0/kafka/tests/kafkatest/tests/produce_consume_validate.py",
>  line 106, in run_produce_consume_validate
> self.start_producer_and_consumer()
>   File 
> "/home/jenkins/workspace/system-test-kafka-0.11.0/kafka/tests/kafkatest/tests/produce_consume_validate.py",
>  line 79, in start_producer_and_consumer
> self.consumer_start_timeout_sec)
>   File 
> "/home/jenkins/workspace/system-test-kafka-0.11.0/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/utils/util.py",
>  line 36, in wait_until
> raise TimeoutError(err_msg)
> TimeoutError: Consumer failed to consume messages for 60s.
> {code}
> There does not appear to be anything wrong with the consumer in the logs, so 
> the consumer failure seems to stem from the Jmx tool timeout.
> Possibly due to https://github.com/apache/kafka/pull/3447/files?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-3741) Allow setting of default topic configs via StreamsConfig

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096142#comment-16096142
 ] 

ASF GitHub Bot commented on KAFKA-3741:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3459


> Allow setting of default topic configs via StreamsConfig
> 
>
> Key: KAFKA-3741
> URL: https://issues.apache.org/jira/browse/KAFKA-3741
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Roger Hoover
>Assignee: Damian Guy
>  Labels: api
> Fix For: 0.11.1.0
>
>
> Kafka Streams currently allows you to specify a replication factor for 
> changelog and repartition topics that it creates. It should also allow you 
> to specify any other TopicConfig. These should be used as defaults when 
> creating internal topics. The defaults should be overridden by any configs 
> provided by the StateStoreSuppliers etc.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-3741) Allow setting of default topic configs via StreamsConfig

2017-07-21 Thread Damian Guy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Guy resolved KAFKA-3741.
---
   Resolution: Fixed
Fix Version/s: 0.11.1.0

Issue resolved by pull request 3459
[https://github.com/apache/kafka/pull/3459]

> Allow setting of default topic configs via StreamsConfig
> 
>
> Key: KAFKA-3741
> URL: https://issues.apache.org/jira/browse/KAFKA-3741
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Roger Hoover
>Assignee: Damian Guy
>  Labels: api
> Fix For: 0.11.1.0
>
>
> Kafka Streams currently allows you to specify a replication factor for 
> changelog and repartition topics that it creates. It should also allow you 
> to specify any other TopicConfig. These should be used as defaults when 
> creating internal topics. The defaults should be overridden by any configs 
> provided by the StateStoreSuppliers etc.
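
For reference, a minimal sketch of the usage this enables, assuming the 
{{topic.}} prefix helper ({{StreamsConfig.topicPrefix}}) that ships with this 
change; the application id, bootstrap servers, and values are placeholders:

{code:java}
import java.util.Properties;

import org.apache.kafka.common.config.TopicConfig;
import org.apache.kafka.streams.StreamsConfig;

public class StreamsTopicDefaults {
    static Properties config() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Defaults for internal (changelog/repartition) topics; any configs
        // supplied through the StateStoreSuppliers override these.
        props.put(StreamsConfig.topicPrefix(TopicConfig.RETENTION_MS_CONFIG), "172800000");
        props.put(StreamsConfig.topicPrefix(TopicConfig.SEGMENT_BYTES_CONFIG), "104857600");
        return props;
    }
}
{code}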



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5618) Kafka stream not receive any topic/partitions/records info

2017-07-21 Thread Yogesh BG (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096112#comment-16096112
 ] 

Yogesh BG commented on KAFKA-5618:
--

I enabled debug and those are all the logs I got; there are no records
present. It is a fresh deployment without any data being sent.




> Kafka stream not receive any topic/partitions/records info
> --
>
> Key: KAFKA-5618
> URL: https://issues.apache.org/jira/browse/KAFKA-5618
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Yogesh BG
>Priority: Critical
> Attachments: rtp-kafkastreams2.log, rtp-kafkastreams3.log, 
> rtp-kafkastreams.log
>
>
> I have 3 brokers and 3 stream consumers.
> There are 360 partitions, and I am not able to bring up streams successfully 
> even after several retries.
> I have attached the logs. 
> There are other topics with around 16 partitions, and those are consumed 
> successfully by the Kafka client.
> When I tried getting a thread dump using jstack, the process did not respond:
> Attaching to process ID 10663, please wait...
> Debugger attached successfully.
> Server compiler detected.
> JVM version is 24.141-b02
> Deadlock Detection:
> java.lang.RuntimeException: Unable to deduce type of thread from address 
> 0x7fdac4009000 (expected type JavaThread, CompilerThread, ServiceThread, 
> JvmtiAgentThread, or SurrogateLockerThread)
> at 
> sun.jvm.hotspot.runtime.Threads.createJavaThreadWrapper(Threads.java:162)
> at sun.jvm.hotspot.runtime.Threads.first(Threads.java:150)
> at 
> sun.jvm.hotspot.runtime.DeadlockDetector.createThreadTable(DeadlockDetector.java:149)
> at 
> sun.jvm.hotspot.runtime.DeadlockDetector.print(DeadlockDetector.java:56)
> at 
> sun.jvm.hotspot.runtime.DeadlockDetector.print(DeadlockDetector.java:39)
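
When an external jstack attach fails like this, a thread dump can still be 
taken from inside the process. A minimal sketch using the standard 
{{ThreadMXBean}} API; where to hook it into the application is up to the 
operator:

{code:java}
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class InProcessThreadDump {
    public static void dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // true, true: include lock and synchronizer info, which is what you
        // want when hunting a stuck or blocked StreamThread.
        for (ThreadInfo info : threads.dumpAllThreads(true, true)) {
            System.out.print(info);
        }
    }
}
{code}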



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5007) Kafka Replica Fetcher Thread- Resource Leak

2017-07-21 Thread zhaojianbo (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096111#comment-16096111
 ] 

zhaojianbo commented on KAFKA-5007:
---

@Joseph Aliase 
Is there any conclusion yet? Did you find the cause?

> Kafka Replica Fetcher Thread- Resource Leak
> ---
>
> Key: KAFKA-5007
> URL: https://issues.apache.org/jira/browse/KAFKA-5007
> Project: Kafka
>  Issue Type: Bug
>  Components: core, network
>Affects Versions: 0.10.0.0, 0.10.1.1, 0.10.2.0
> Environment: Centos 7
> Jave 8
>Reporter: Joseph Aliase
>Priority: Critical
>  Labels: reliability
> Attachments: jstack-kafka.out, jstack-zoo.out, lsofkafka.txt, 
> lsofzookeeper.txt
>
>
> Kafka runs out of open file descriptors when the system network interface is 
> down.
> Issue description:
> We have a Kafka cluster of 5 nodes running version 0.10.1.1. The open file 
> descriptor limit for the account running Kafka is set to 10.
> During an upgrade, the network interface went down. The outage continued for 
> 12 hours; eventually all the brokers crashed with a java.io.IOException: Too 
> many open files error.
> We repeated the test in a lower environment and observed that the open socket 
> count keeps increasing while the NIC is down.
> We have around 13 topics with a maximum of 120 partitions, and the number of 
> replica fetcher threads is set to 8.
> Using an internal monitoring tool, we observed that the open socket descriptor 
> count for the broker pid continued to increase although the NIC was down, 
> leading to the open file descriptor error. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5563) Clarify handling of connector name in config

2017-07-21 Thread JIRA

[ 
https://issues.apache.org/jira/browse/KAFKA-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096076#comment-16096076
 ] 

Sönke Liebau commented on KAFKA-5563:
-

Alright, I'll prepare a PR to introduce this check once 
[2755|https://github.com/apache/kafka/pull/2755] has been reviewed and merged. 
That PR touches exactly the same code, so I think basing this change on it 
will make merging easier.

> Clarify handling of connector name in config 
> -
>
> Key: KAFKA-5563
> URL: https://issues.apache.org/jira/browse/KAFKA-5563
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.11.0.0
>Reporter: Sönke Liebau
>Priority: Minor
>
> The connector name is currently being stored in two places, once at the root 
> level of the connector and once in the config:
> {code:java}
> {
>   "name": "test",
>   "config": {
>   "connector.class": 
> "org.apache.kafka.connect.tools.MockSinkConnector",
>   "tasks.max": "3",
>   "topics": "test-topic",
>   "name": "test"
>   },
>   "tasks": [
>   {
>   "connector": "test",
>   "task": 0
>   }
>   ]
> }
> {code}
> If no name is provided in the "config" element, then the name from the root 
> level is [copied there when the connector is being 
> created|https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/resources/ConnectorsResource.java#L95].
>  If, however, a name is provided in the config, then it is not touched, which 
> means it is possible to create a connector with a different name at the root 
> level and in the config, like this:
> {code:java}
> {
>   "name": "test1",
>   "config": {
>   "connector.class": 
> "org.apache.kafka.connect.tools.MockSinkConnector",
>   "tasks.max": "3",
>   "topics": "test-topic",
>   "name": "differentname"
>   },
>   "tasks": [
>   {
>   "connector": "test1",
>   "task": 0
>   }
>   ]
> }
> {code}
> I am not aware of any issues that this currently causes, but it is at least 
> confusing, probably not intended behavior, and it definitely bears potential 
> for bugs if different functions take the name from different places.
> Would it make sense to add a check that rejects requests which provide 
> different names at the root level and in the config section?
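
A sketch of what such a check could look like; the method name and placement 
are hypothetical, not the actual ConnectorsResource code:

{code:java}
import java.util.Map;

public class ConnectorNameCheck {
    // Hypothetical guard: reject requests whose config "name" disagrees with
    // the top-level name.
    static void validateName(String topLevelName, Map<String, String> config) {
        String nameInConfig = config.get("name");
        if (nameInConfig != null && !nameInConfig.equals(topLevelName)) {
            throw new IllegalArgumentException("Connector name in config ("
                    + nameInConfig + ") must match the top-level name ("
                    + topLevelName + ")");
        }
    }
}
{code}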



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (KAFKA-4915) Make logging consistent

2017-07-21 Thread Andrey (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096059#comment-16096059
 ] 

Andrey edited comment on KAFKA-4915 at 7/21/17 9:44 AM:


Not really. You log the message at the "WARN" log level, yet the first word 
of the message is "Error". If it is really an error, it should be logged at 
the "ERROR" level; if it is not an error, the message should not contain the 
word "Error".

Sorry for such a boring JIRA, but it makes sense to be able to grep for the 
"Error" keyword and get only errors, and to set up monitoring that highlights 
errors in red and warnings in yellow.

And it's definitely not a major issue.

I suggest:
- replace the "Error" keyword with something neutral like "Unable"


was (Author: dernasherbrezon):
Not really. You log the message at the "WARN" log level, yet the first word 
of the message is "Error". If it is really an error, it should be logged at 
the "ERROR" level; if it is not an error, the message should not contain the 
word "Error".

Sorry for such a boring JIRA, but it makes sense to be able to grep for the 
"Error" keyword and get only errors, and to set up monitoring that highlights 
errors in red and warnings in yellow.

And it's definitely not a major issue.

> Make logging consistent
> ---
>
> Key: KAFKA-4915
> URL: https://issues.apache.org/jira/browse/KAFKA-4915
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.1.1
>Reporter: Andrey
>
> Currently we see in logs:
> {code}
> WARN Error while fetching metadata with correlation id 659455 : 
> {kafka-connect-offsets=INVALID_REPLICATION_FACTOR} 
> (org.apache.kafka.clients.NetworkClient:600)
> {code}
> Expected:
> {code}
> WARN unable to fetch metadata with correlation id 659455 : 
> {kafka-connect-offsets=INVALID_REPLICATION_FACTOR} 
> (org.apache.kafka.clients.NetworkClient:600)
> {code}
> or
> {code}
> ERROR unable to fetch metadata with correlation id 659455 : 
> {kafka-connect-offsets=INVALID_REPLICATION_FACTOR} 
> (org.apache.kafka.clients.NetworkClient:600)
> {code}
> See for reference: 
> https://github.com/apache/kafka/blob/65650ba4dcba8a9729cb9cb6477a62a7b7c3714e/clients/src/main/java/org/apache/kafka/clients/NetworkClient.java#L723



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (KAFKA-4915) Make logging consistent

2017-07-21 Thread Andrey (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096059#comment-16096059
 ] 

Andrey edited comment on KAFKA-4915 at 7/21/17 9:36 AM:


Not really. You log the message at the "WARN" log level, yet the first word 
of the message is "Error". If it is really an error, it should be logged at 
the "ERROR" level; if it is not an error, the message should not contain the 
word "Error".

Sorry for such a boring JIRA, but it makes sense to be able to grep for the 
"Error" keyword and get only errors, and to set up monitoring that highlights 
errors in red and warnings in yellow.

And it's definitely not a major issue.


was (Author: dernasherbrezon):
Not really. You log the message at the "WARN" log level, yet the first word 
of the message is "Error". If it is really an error, it should be logged at 
the "ERROR" level; if it is not an error, the message should not contain the 
word "Error".

Sorry for such a boring JIRA, but it makes sense to be able to grep for the 
"Error" keyword and get only errors, and to set up monitoring that highlights 
errors in red and warnings in yellow.

> Make logging consistent
> ---
>
> Key: KAFKA-4915
> URL: https://issues.apache.org/jira/browse/KAFKA-4915
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.1.1
>Reporter: Andrey
>
> Currently we see in logs:
> {code}
> WARN Error while fetching metadata with correlation id 659455 : 
> {kafka-connect-offsets=INVALID_REPLICATION_FACTOR} 
> (org.apache.kafka.clients.NetworkClient:600)
> {code}
> Expected:
> {code}
> WARN unable to fetch metadata with correlation id 659455 : 
> {kafka-connect-offsets=INVALID_REPLICATION_FACTOR} 
> (org.apache.kafka.clients.NetworkClient:600)
> {code}
> or
> {code}
> ERROR unable to fetch metadata with correlation id 659455 : 
> {kafka-connect-offsets=INVALID_REPLICATION_FACTOR} 
> (org.apache.kafka.clients.NetworkClient:600)
> {code}
> See for reference: 
> https://github.com/apache/kafka/blob/65650ba4dcba8a9729cb9cb6477a62a7b7c3714e/clients/src/main/java/org/apache/kafka/clients/NetworkClient.java#L723



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-4915) Make logging consistent

2017-07-21 Thread Andrey (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096059#comment-16096059
 ] 

Andrey commented on KAFKA-4915:
---

Not really. You log the message at the "WARN" log level, yet the first word 
of the message is "Error". If it is really an error, it should be logged at 
the "ERROR" level; if it is not an error, the message should not contain the 
word "Error".

Sorry for such a boring JIRA, but it makes sense to be able to grep for the 
"Error" keyword and get only errors, and to set up monitoring that highlights 
errors in red and warnings in yellow.

> Make logging consistent
> ---
>
> Key: KAFKA-4915
> URL: https://issues.apache.org/jira/browse/KAFKA-4915
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.1.1
>Reporter: Andrey
>
> Currently we see in logs:
> {code}
> WARN Error while fetching metadata with correlation id 659455 : 
> {kafka-connect-offsets=INVALID_REPLICATION_FACTOR} 
> (org.apache.kafka.clients.NetworkClient:600)
> {code}
> Expected:
> {code}
> WARN unable to fetch metadata with correlation id 659455 : 
> {kafka-connect-offsets=INVALID_REPLICATION_FACTOR} 
> (org.apache.kafka.clients.NetworkClient:600)
> {code}
> or
> {code}
> ERROR unable to fetch metadata with correlation id 659455 : 
> {kafka-connect-offsets=INVALID_REPLICATION_FACTOR} 
> (org.apache.kafka.clients.NetworkClient:600)
> {code}
> See for reference: 
> https://github.com/apache/kafka/blob/65650ba4dcba8a9729cb9cb6477a62a7b7c3714e/clients/src/main/java/org/apache/kafka/clients/NetworkClient.java#L723



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5611) One or more consumers in a consumer-group stop consuming after rebalancing

2017-07-21 Thread huxihx (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095916#comment-16095916
 ] 

huxihx commented on KAFKA-5611:
---

[~pskianis] In the log, it seems that the consumer on Server 3 did not get 
any partitions assigned with generation ID 12470.
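
One way to make the assignment visible per consumer is to log it from a 
rebalance listener. A minimal sketch against the standard consumer API, 
assuming an SLF4J logger and a placeholder topic name:

{code:java}
import java.util.Collection;
import java.util.Collections;

import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class AssignmentLogging {
    private static final Logger log = LoggerFactory.getLogger(AssignmentLogging.class);

    static void subscribe(KafkaConsumer<String, String> consumer) {
        consumer.subscribe(Collections.singletonList("my-topic"),
                new ConsumerRebalanceListener() {
                    @Override
                    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                        log.info("Partitions revoked: {}", partitions);
                    }

                    @Override
                    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                        // An empty collection here after a rebalance matches
                        // the symptom described above.
                        log.info("Partitions assigned: {}", partitions);
                    }
                });
    }
}
{code}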

> One or more consumers in a consumer-group stop consuming after rebalancing
> --
>
> Key: KAFKA-5611
> URL: https://issues.apache.org/jira/browse/KAFKA-5611
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.0
>Reporter: Panos Skianis
> Attachments: kka02, Server 1, Server 2, Server 3
>
>
> Scenario: 
>   - 3 zookeepers, 4 Kafkas. 0.10.2.0, with 0.9.0 compatibility still on 
> (other apps need it but the one mentioned below is already on kafka 0.10.2.0  
> client).
>   - 3 servers running 1 consumer each under the same consumer groupId. 
>   - The servers seem to be consuming messages happily, but then a timeout 
> to an external service causes our app to restart the Kafka Consumer on 
> one of the servers (this is by design). That causes a rebalance of the 
> group, and upon restart one of the Consumers seems to "block".
>   - Server 3 is where the problems occur.
>   - The problem fixes itself either by restarting one of the 3 servers, or by 
> causing the group to rebalance again using the console consumer with 
> autocommit set to false and the same group.
>  
> Note: 
>  - Haven't managed to recreate it at will yet.
>  - It mainly happens in the production environment, often enough. Hence I do 
> not have any logs with DEBUG/TRACE statements yet.
>  - Extracts from the log of each app server are attached, along with the log 
> of the kafka broker that seems to be dealing with the related group and 
> generations.
>  - See COMMENT lines in the files for further info.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5067) java.sql.SQLDataException on TimeStamp column when using AWS Redshift as a JDBC source

2017-07-21 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095854#comment-16095854
 ] 

Ewen Cheslack-Postava commented on KAFKA-5067:
--

This also looks like it has more to do with [the JDBC 
connector|https://github.com/confluentinc/kafka-connect-jdbc] than with the 
Kafka Connect framework. It's likely that any bug here should be filed 
against the connector rather than the framework.

> java.sql.SQLDataException on TimeStamp column when using AWS Redshift as a 
> JDBC source
> --
>
> Key: KAFKA-5067
> URL: https://issues.apache.org/jira/browse/KAFKA-5067
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: Curtis Wilde
>Priority: Minor
>
> Kafka Connect throws java.sql.SQLDataException when attempting to use 
> Redshift as a data source.
> When I run the query "select CURRENT_TIMESTAMP;" in a SQL editor it returns:
> 2017-04-13 16:11:25.204925+00
> Full stack trace:
> [2017-04-13 09:44:09,910] ERROR Failed to get current time from DB using 
> query select CURRENT_TIMESTAMP; on database PostgreSQL 
> (io.confluent.connect.jdbc.util.JdbcUtils:205)
> java.sql.SQLDataException: [Amazon][JDBC](10140) Error converting value to 
> Timestamp.
>   at com.amazon.exceptions.ExceptionConverter.toSQLException(Unknown 
> Source)
>   at 
> com.amazon.utilities.conversion.TypeConverter.convertToTimestamp(Unknown 
> Source)
>   at com.amazon.utilities.conversion.TypeConverter.toTimestamp(Unknown 
> Source)
>   at com.amazon.jdbc.common.SForwardResultSet.getTimestamp(Unknown Source)
>   at 
> io.confluent.connect.jdbc.util.JdbcUtils.getCurrentTimeOnDB(JdbcUtils.java:201)
>   at 
> io.confluent.connect.jdbc.source.TimestampIncrementingTableQuerier.executeQuery(TimestampIncrementingTableQuerier.java:169)
>   at 
> io.confluent.connect.jdbc.source.TableQuerier.maybeStartQuery(TableQuerier.java:84)
>   at 
> io.confluent.connect.jdbc.source.TimestampIncrementingTableQuerier.maybeStartQuery(TimestampIncrementingTableQuerier.java:55)
>   at 
> io.confluent.connect.jdbc.source.JdbcSourceTask.poll(JdbcSourceTask.java:200)
>   at 
> org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:162)
>   at 
> org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:139)
>   at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:182)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> [2017-04-13 09:44:09,912] ERROR Failed to run query for table 
> TimestampIncrementingTableQuerier{name='null', query='', 
> topicPrefix='', timestampColumn='', 
> incrementingColumn='null'}: {} 
> (io.confluent.connect.jdbc.source.JdbcSourceTask:221)
> java.sql.SQLDataException: [Amazon][JDBC](10140) Error converting value to 
> Timestamp.
>   at com.amazon.exceptions.ExceptionConverter.toSQLException(Unknown 
> Source)
>   at 
> com.amazon.utilities.conversion.TypeConverter.convertToTimestamp(Unknown 
> Source)
>   at com.amazon.utilities.conversion.TypeConverter.toTimestamp(Unknown 
> Source)
>   at com.amazon.jdbc.common.SForwardResultSet.getTimestamp(Unknown Source)
>   at 
> io.confluent.connect.jdbc.util.JdbcUtils.getCurrentTimeOnDB(JdbcUtils.java:201)
>   at 
> io.confluent.connect.jdbc.source.TimestampIncrementingTableQuerier.executeQuery(TimestampIncrementingTableQuerier.java:169)
>   at 
> io.confluent.connect.jdbc.source.TableQuerier.maybeStartQuery(TableQuerier.java:84)
>   at 
> io.confluent.connect.jdbc.source.TimestampIncrementingTableQuerier.maybeStartQuery(TimestampIncrementingTableQuerier.java:55)
>   at 
> io.confluent.connect.jdbc.source.JdbcSourceTask.poll(JdbcSourceTask.java:200)
>   at 
> org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:162)
>   at 
> org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:139)
>   at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:182)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> [2017-04-13 
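
The failing call path from the stack trace above can be reproduced outside 
Connect with plain JDBC. A minimal sketch; the connection URL and credentials 
are placeholders, and the failure appears to be the driver's conversion of the 
microsecond-precision "...+00" value:

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.sql.Timestamp;

public class RedshiftTimestampRepro {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:redshift://host:5439/db", "user", "pass");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("select CURRENT_TIMESTAMP;")) {
            if (rs.next()) {
                // This is the call that throws SQLDataException in
                // JdbcUtils.getCurrentTimeOnDB above.
                Timestamp now = rs.getTimestamp(1);
                System.out.println(now);
            }
        }
    }
}
{code}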

[jira] [Updated] (KAFKA-5621) The producer should retry expired batches when retries are enabled

2017-07-21 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5621:
---
Fix Version/s: 0.11.1.0

> The producer should retry expired batches when retries are enabled
> --
>
> Key: KAFKA-5621
> URL: https://issues.apache.org/jira/browse/KAFKA-5621
> Project: Kafka
>  Issue Type: Bug
>Reporter: Apurva Mehta
> Fix For: 0.11.1.0
>
>
> Today, when a batch is expired in the accumulator, a {{TimeoutException}} is 
> raised to the user.
> It might be better for the producer to retry the expired batch, up to the 
> configured number of retries. This is more intuitive from the user's point of 
> view. 
> Further, the proposed behavior makes it easier for applications like mirror 
> maker to provide ordering guarantees even when batches expire. Today, they 
> would resend the expired batch and it would get added to the back of the 
> queue, causing the output ordering to differ from the input ordering.
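
For context, a sketch of the producer settings involved; note that the 
retry-on-expiry behavior itself is the proposal and is not enabled by any of 
these settings today. Values are illustrative:

{code:java}
import java.util.Properties;

import org.apache.kafka.clients.producer.ProducerConfig;

public class ProducerRetrySettings {
    static Properties config() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Under the proposal, an expired batch would consume one of these
        // retries instead of immediately raising TimeoutException.
        props.put(ProducerConfig.RETRIES_CONFIG, 5);
        // Keeps retries from reordering output, which matters for mirror maker.
        props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 1);
        return props;
    }
}
{code}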



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-5089) JAR mismatch in KafkaConnect leads to NoSuchMethodError in HDP 2.6

2017-07-21 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-5089.
--
Resolution: Fixed

Going to close this since it should be resolved by 
[KIP-146|https://cwiki.apache.org/confluence/display/KAFKA/KIP-146+-+Classloading+Isolation+in+Connect]
 which provides better classloader isolation. Please reopen if this is still an 
issue even after that feature was added.

> JAR mismatch in KafkaConnect leads to NoSuchMethodError in HDP 2.6
> --
>
> Key: KAFKA-5089
> URL: https://issues.apache.org/jira/browse/KAFKA-5089
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.1.1
> Environment: HDP 2.6, Centos 7.3.1611, 
> kafka-0.10.1.2.6.0.3-8.el6.noarch
>Reporter: Christoph Körner
>
> When I follow the steps in the Getting Started guide for Kafka Connect 
> (https://kafka.apache.org/quickstart#quickstart_kafkaconnect), it throws a 
> NoSuchMethodError. 
> {code:borderStyle=solid}
> [root@devbox kafka-broker]# ./bin/connect-standalone.sh 
> config/connect-standalone.properties config/connect-file-source.properties 
> config/connect-file-sink.properties
> [2017-04-19 14:38:36,583] INFO StandaloneConfig values:
> access.control.allow.methods =
> access.control.allow.origin =
> bootstrap.servers = [localhost:6667]
> internal.key.converter = class 
> org.apache.kafka.connect.json.JsonConverter
> internal.value.converter = class 
> org.apache.kafka.connect.json.JsonConverter
> key.converter = class org.apache.kafka.connect.json.JsonConverter
> offset.flush.interval.ms = 1
> offset.flush.timeout.ms = 5000
> offset.storage.file.filename = /tmp/connect.offsets
> rest.advertised.host.name = null
> rest.advertised.port = null
> rest.host.name = null
> rest.port = 8083
> task.shutdown.graceful.timeout.ms = 5000
> value.converter = class org.apache.kafka.connect.json.JsonConverter
>  (org.apache.kafka.connect.runtime.standalone.StandaloneConfig:180)
> [2017-04-19 14:38:36,756] INFO Logging initialized @714ms 
> (org.eclipse.jetty.util.log:186)
> [2017-04-19 14:38:36,871] INFO Kafka Connect starting 
> (org.apache.kafka.connect.runtime.Connect:52)
> [2017-04-19 14:38:36,872] INFO Herder starting 
> (org.apache.kafka.connect.runtime.standalone.StandaloneHerder:70)
> [2017-04-19 14:38:36,872] INFO Worker starting 
> (org.apache.kafka.connect.runtime.Worker:114)
> [2017-04-19 14:38:36,873] INFO Starting FileOffsetBackingStore with file 
> /tmp/connect.offsets 
> (org.apache.kafka.connect.storage.FileOffsetBackingStore:60)
> [2017-04-19 14:38:36,877] INFO Worker started 
> (org.apache.kafka.connect.runtime.Worker:119)
> [2017-04-19 14:38:36,878] INFO Herder started 
> (org.apache.kafka.connect.runtime.standalone.StandaloneHerder:72)
> [2017-04-19 14:38:36,878] INFO Starting REST server 
> (org.apache.kafka.connect.runtime.rest.RestServer:98)
> [2017-04-19 14:38:37,077] INFO jetty-9.2.15.v20160210 
> (org.eclipse.jetty.server.Server:327)
> [2017-04-19 14:38:37,154] WARN FAILED 
> o.e.j.s.ServletContextHandler@3c46e67a{/,null,STARTING}: 
> java.lang.NoSuchMethodError: 
> javax.ws.rs.core.Application.getProperties()Ljava/util/Map; 
> (org.eclipse.jetty.util.component.AbstractLifeCycle:212)
> java.lang.NoSuchMethodError: 
> javax.ws.rs.core.Application.getProperties()Ljava/util/Map;
> at 
> org.glassfish.jersey.server.ApplicationHandler.(ApplicationHandler.java:331)
> at 
> org.glassfish.jersey.servlet.WebComponent.(WebComponent.java:392)
> at 
> org.glassfish.jersey.servlet.ServletContainer.init(ServletContainer.java:177)
> at 
> org.glassfish.jersey.servlet.ServletContainer.init(ServletContainer.java:369)
> at javax.servlet.GenericServlet.init(GenericServlet.java:241)
> at 
> org.eclipse.jetty.servlet.ServletHolder.initServlet(ServletHolder.java:616)
> at 
> org.eclipse.jetty.servlet.ServletHolder.initialize(ServletHolder.java:396)
> at 
> org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:871)
> at 
> org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:298)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:741)
> at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
> at 
> org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:132)
> at 
> org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:114)
> at 
> org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:61)
>   

[jira] [Assigned] (KAFKA-4930) Connect Rest API allows creating connectors with an empty name

2017-07-21 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava reassigned KAFKA-4930:


Assignee: Sönke Liebau

> Connect Rest API allows creating connectors with an empty name
> --
>
> Key: KAFKA-4930
> URL: https://issues.apache.org/jira/browse/KAFKA-4930
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: Sönke Liebau
>Assignee: Sönke Liebau
>Priority: Minor
>
> The Connect REST API allows deploying connectors with an empty name field, 
> which then cannot be removed through the API.
> Sending the following request:
> {code}
> {
> "name": "",
> "config": {
> "connector.class": 
> "org.apache.kafka.connect.tools.MockSourceConnector",
> "tasks.max": "1",
> "topics": "test-topic"
> 
> }
> }
> {code}
> Results in a connector being deployed which can be seen in the list of 
> connectors:
> {code}
> [
>   "",
>   "testconnector"
> ]{code}
> But it cannot be removed via a DELETE call, as the API thinks we are trying 
> to delete the /connectors endpoint and declines the request.
> I don't think there is a valid case for the connector name to be empty, so 
> perhaps we should add a check for this. I am happy to work on this.
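
The check could be as small as the following sketch (hypothetical names; the 
real fix would live in the Connect REST layer):

{code:java}
public class EmptyNameCheck {
    // Hypothetical guard: reject null, empty, or whitespace-only connector
    // names before the connector is created.
    static void validateName(String name) {
        if (name == null || name.trim().isEmpty()) {
            throw new IllegalArgumentException("Connector name must not be empty");
        }
    }
}
{code}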



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5000) Framework should log some progress information regularly to assist in troubleshooting

2017-07-21 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095830#comment-16095830
 ] 

Ewen Cheslack-Postava commented on KAFKA-5000:
--

Should this be covered by logs, or just by the metrics that we want to add 
(though they are still missing)?

> Framework should log some progress information regularly to assist in 
> troubleshooting
> -
>
> Key: KAFKA-5000
> URL: https://issues.apache.org/jira/browse/KAFKA-5000
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Gwen Shapira
>
> We get many questions of the type:
> "I started a connector, it doesn't seem to make any progress, I don't know 
> what to do"
> I think that periodic "progress reports" in the worker logs may help. 
> We have the offset commit message: "INFO 
> WorkerSinkTask{id=cassandra-sink-connector-0} Committing offsets 
> (org.apache.kafka.connect.runtime.WorkerSinkTask:244)"
> But I think we'd also want to know: topic, partition, offsets, how many rows 
> were read from source/kafka and how many were successfully written.
> This will help determine if there is any progress being made and whether some 
> partitions are "stuck" for some reason.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-4915) Make logging consistent

2017-07-21 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095825#comment-16095825
 ] 

Ewen Cheslack-Postava commented on KAFKA-4915:
--

[~dernasherbrezon] Can you clarify the problem? Is the issue that there's an 
extra newline in the log somehow that is causing the info about the source of 
the error to be on a different line?

> Make logging consistent
> ---
>
> Key: KAFKA-4915
> URL: https://issues.apache.org/jira/browse/KAFKA-4915
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.1.1
>Reporter: Andrey
>
> Currently we see in logs:
> {code}
> WARN Error while fetching metadata with correlation id 659455 : 
> {kafka-connect-offsets=INVALID_REPLICATION_FACTOR} 
> (org.apache.kafka.clients.NetworkClient:600)
> {code}
> Expected:
> {code}
> WARN unable to fetch metadata with correlation id 659455 : 
> {kafka-connect-offsets=INVALID_REPLICATION_FACTOR} 
> (org.apache.kafka.clients.NetworkClient:600)
> {code}
> or
> {code}
> ERROR unable to fetch metadata with correlation id 659455 : 
> {kafka-connect-offsets=INVALID_REPLICATION_FACTOR} 
> (org.apache.kafka.clients.NetworkClient:600)
> {code}
> See for reference: 
> https://github.com/apache/kafka/blob/65650ba4dcba8a9729cb9cb6477a62a7b7c3714e/clients/src/main/java/org/apache/kafka/clients/NetworkClient.java#L723



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)