[jira] [Commented] (KAFKA-6338) java.lang.NoClassDefFoundError: org/apache/kafka/common/network/LoginType

2017-12-08 Thread Ronald van de Kuil (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284657#comment-16284657
 ] 

Ronald van de Kuil commented on KAFKA-6338:
---

[2017-12-09 06:54:22,233] ERROR Error getting principal. 
(org.apache.ranger.authorization.kafka.authorizer.RangerKafkaAuthorizer)
java.lang.NoClassDefFoundError: org/apache/kafka/common/network/LoginType
at 
org.apache.ranger.authorization.kafka.authorizer.RangerKafkaAuthorizer.configure(RangerKafkaAuthorizer.java:82)
at 
org.apache.ranger.authorization.kafka.authorizer.RangerKafkaAuthorizer.configure(RangerKafkaAuthorizer.java:94)
at kafka.server.KafkaServer.$anonfun$startup$4(KafkaServer.scala:254)
at scala.Option.map(Option.scala:146)
at kafka.server.KafkaServer.startup(KafkaServer.scala:252)
at 
kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:38)
at kafka.Kafka$.main(Kafka.scala:92)
at kafka.Kafka.main(Kafka.scala)
Caused by: java.lang.ClassNotFoundException: 
org.apache.kafka.common.network.LoginType
at java.lang.ClassLoader.findClass(ClassLoader.java:530)
at 
org.apache.ranger.plugin.classloader.RangerPluginClassLoader$MyClassLoader.findClass(RangerPluginClassLoader.java:272)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at 
org.apache.ranger.plugin.classloader.RangerPluginClassLoader.loadClass(RangerPluginClassLoader.java:125)
... 8 more


> java.lang.NoClassDefFoundError: org/apache/kafka/common/network/LoginType
> -
>
> Key: KAFKA-6338
> URL: https://issues.apache.org/jira/browse/KAFKA-6338
> Project: Kafka
>  Issue Type: Test
>Affects Versions: 1.0.0
>Reporter: Ronald van de Kuil
>Priority: Minor
>
> I have just set up a Kerberized Kafka cluster with Ranger 0.7.1 and Kafka 
> 1.0.0. 
> It all seems to work fine: I see that authorisation policies are enforced 
> and audit logging is present.
> On startup of a Kafka server I see a stack trace, but it does not seem to 
> matter.
> My wish is to keep the logs tidy and free of false alerts.
> I wonder whether I have an issue somewhere.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-6338) java.lang.NoClassDefFoundError: org/apache/kafka/common/network/LoginType

2017-12-08 Thread Ronald van de Kuil (JIRA)
Ronald van de Kuil created KAFKA-6338:
-

 Summary: java.lang.NoClassDefFoundError: 
org/apache/kafka/common/network/LoginType
 Key: KAFKA-6338
 URL: https://issues.apache.org/jira/browse/KAFKA-6338
 Project: Kafka
  Issue Type: Test
Affects Versions: 1.0.0
Reporter: Ronald van de Kuil
Priority: Minor


I have just set up a Kerberized Kafka cluster with Ranger 0.7.1 and Kafka 1.0.0. 

It all seems to work fine: I see that authorisation policies are enforced and 
audit logging is present.

On startup of a Kafka server I see a stack trace, but it does not seem to matter.

My wish is to keep the logs tidy and free of false alerts.

I wonder whether I have an issue somewhere.
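The stack trace suggests the Ranger 0.7.1 plugin was compiled against an older Kafka whose org.apache.kafka.common.network.LoginType class no longer exists in the 1.0.0 client jars. Whether a given class is visible on the running classpath can be confirmed with a reflective probe; this is a generic sketch, not Ranger or Kafka code:

```java
// A minimal sketch: reflectively probe whether a class is visible on the
// current classpath. This is the same lookup that fails in the stack trace
// above (Class.forName throws ClassNotFoundException when the class is gone).
public class ClassProbe {
    static boolean isPresent(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // On a Kafka 1.0.0 classpath this prints "false": the class the
        // Ranger 0.7.1 plugin references is no longer shipped.
        System.out.println(isPresent("org.apache.kafka.common.network.LoginType"));
    }
}
```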





[jira] [Created] (KAFKA-6337) Error for partition [__consumer_offsets,15] to broker

2017-12-08 Thread Abhi (JIRA)
Abhi created KAFKA-6337:
---

 Summary: Error for partition [__consumer_offsets,15] to broker
 Key: KAFKA-6337
 URL: https://issues.apache.org/jira/browse/KAFKA-6337
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.10.2.0
 Environment: Windows running Kafka (0.10.2.0)
3 ZK instances running on 3 different Windows servers, 7 Kafka broker nodes 
running on a single Windows machine with a different disk for the logs directory.
Reporter: Abhi


Hello *

I have been running Kafka (0.10.2.0) on Windows for the past year.

But of late there have been unique broker issues that I have observed 4-5 times 
in the last 4 months.

Kafka setup config:

3 ZK instances running on 3 different Windows servers, 7 Kafka broker nodes 
running on a single Windows machine with a different disk for the logs directory.

My Kafka has 2 topics with 50 partitions each, and a replication factor of 3.

My partition selection logic: each message has a unique ID, and the partition 
is chosen as (unique ID % 50); the Kafka producer API is then called to route 
each message to that specific topic partition.
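The routing logic described above can be sketched as a small pure function; the producer wiring is only indicated in a comment, and the sample ID is hypothetical:

```java
// Sketch of the partition-routing logic described in the report:
// a unique message ID is mapped to one of the 50 partitions by modulo.
public class PartitionRouting {
    static final int NUM_PARTITIONS = 50;

    // Pure routing function: maps a unique message ID to a partition index.
    static int partitionFor(long uniqueId) {
        return (int) (uniqueId % NUM_PARTITIONS);
    }

    public static void main(String[] args) {
        // A producer would then pin each record to the computed partition,
        // e.g. new ProducerRecord<>(topic, partitionFor(id), key, value)
        // with the Kafka producer API; topic and key here are hypothetical.
        System.out.println(partitionFor(12345L)); // prints 45 (12345 % 50)
    }
}
```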

Each of my broker properties files looks like this:

{code}
broker.id=0
port:9093
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
offsets.retention.minutes=360
advertised.host.name=1.1.1.2
advertised.port:9093
# A comma separated list of directories under which to store log files
log.dirs=C:\\kafka_2.10-0.10.2.0-SNAPSHOT\\data\\kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
log.retention.minutes=360
log.segment.bytes=52428800
log.retention.check.interval.ms=30
log.cleaner.enable=true
log.cleanup.policy=delete
log.cleaner.min.cleanable.ratio=0.5
log.cleaner.backoff.ms=15000
log.segment.delete.delay.ms=6000
auto.create.topics.enable=false
zookeeper.connect=1.1.1.2:2181,1.1.1.3:2182,1.1.1.4:2183
zookeeper.connection.timeout.ms=6000
{code}
But of late a unique error has been cropping up in the Kafka broker nodes:
_[2017-12-02 02:47:40,024] ERROR [ReplicaFetcherThread-0-4], Error for 
partition [__consumer_offsets,15] to broker 
4:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is 
not the leader for that topic-partition. (kafka.server.ReplicaFetcherThread)_

The entire server.log is filled with these messages, and it grows very large. 
Please help me understand under what circumstances this can occur and what 
measures I need to take.

This is the third Saturday in a row that I have faced a similar issue.

Courtesy
Abhi





[jira] [Commented] (KAFKA-4218) Enable access to key in ValueTransformer

2017-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284488#comment-16284488
 ] 

ASF GitHub Bot commented on KAFKA-4218:
---

GitHub user jeyhunkarimov opened a pull request:

https://github.com/apache/kafka/pull/4309

KAFKA-4218: Enable access to key in ValueTransformer and ValueMapper

This PR is the partial implementation for KIP-149. As the discussion for 
this KIP is still ongoing, I made a PR on the "safe" portions of the KIP (so 
that it can be included in the next release) which are 1) `ValueMapperWithKey`, 
2) `ValueTransformerWithKeySupplier`, and 3) `ValueTransformerWithKey`.
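The KIP-149 interfaces listed above give value mappers and transformers read-only access to the record key. A self-contained sketch of the `ValueMapperWithKey` shape follows; it mirrors the proposal but is an illustration, not the exact merged API:

```java
// Sketch of the ValueMapperWithKey shape from KIP-149: the key is passed in
// read-only, so a new value can be derived from both key and value without
// falling back to KStream.transform and allocating new KeyValue objects.
public class KeyAwareMapping {
    @FunctionalInterface
    interface ValueMapperWithKey<K, V, VR> {
        VR apply(K readOnlyKey, V value);
    }

    public static void main(String[] args) {
        ValueMapperWithKey<String, Integer, String> mapper =
                (key, value) -> key + "=" + value;
        // In Streams this would be passed to something like
        // stream.mapValues(mapper); here we just invoke it directly.
        System.out.println(mapper.apply("count", 7)); // prints count=7
    }
}
```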




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jeyhunkarimov/kafka KIP-149_hope_last

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4309.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4309


commit 91c1108c72f2f3b3097dcff3bd2aad237789215e
Author: Jeyhun Karimov 
Date:   2017-12-09T00:56:36Z

Submit the first version of KIP-149




> Enable access to key in ValueTransformer
> 
>
> Key: KAFKA-4218
> URL: https://issues.apache.org/jira/browse/KAFKA-4218
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.0.1
>Reporter: Elias Levy
>Assignee: Jeyhun Karimov
>  Labels: api, kip
> Fix For: 1.1.0
>
>
> While transforming values via {{KStream.transformValues}} and 
> {{ValueTransformer}}, the key associated with the value may be needed, even 
> if it is not changed.  For instance, it may be used to access stores.  
> As of now, the key is not available within these methods and interfaces, 
> leading to the use of {{KStream.transform}} and {{Transformer}}, and the 
> unnecessary creation of new {{KeyValue}} objects.
> KIP-149: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-149%3A+Enabling+key+access+in+ValueTransformer%2C+ValueMapper%2C+and+ValueJoiner





[jira] [Commented] (KAFKA-4218) Enable access to key in ValueTransformer

2017-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284486#comment-16284486
 ] 

ASF GitHub Bot commented on KAFKA-4218:
---

GitHub user jeyhunkarimov closed the pull request at:

https://github.com/apache/kafka/pull/3570


> Enable access to key in ValueTransformer
> 
>
> Key: KAFKA-4218
> URL: https://issues.apache.org/jira/browse/KAFKA-4218
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.0.1
>Reporter: Elias Levy
>Assignee: Jeyhun Karimov
>  Labels: api, kip
> Fix For: 1.1.0
>
>
> While transforming values via {{KStream.transformValues}} and 
> {{ValueTransformer}}, the key associated with the value may be needed, even 
> if it is not changed.  For instance, it may be used to access stores.  
> As of now, the key is not available within these methods and interfaces, 
> leading to the use of {{KStream.transform}} and {{Transformer}}, and the 
> unnecessary creation of new {{KeyValue}} objects.
> KIP-149: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-149%3A+Enabling+key+access+in+ValueTransformer%2C+ValueMapper%2C+and+ValueJoiner





[jira] [Commented] (KAFKA-6334) Minor documentation typo

2017-12-08 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284480#comment-16284480
 ] 

Guozhang Wang commented on KAFKA-6334:
--

Thanks for the report [~noslowerdna], mind submitting a patch for it? 
https://cwiki.apache.org/confluence/display/KAFKA/Contributing+Website+Documentation+Changes

> Minor documentation typo
> 
>
> Key: KAFKA-6334
> URL: https://issues.apache.org/jira/browse/KAFKA-6334
> Project: Kafka
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 1.0.0
>Reporter: Andrew Olson
>Priority: Trivial
>
> At [1]:
> {quote}
> 0.11.0 consumers support backwards compatibility with brokers 0.10.0 brokers 
> and upward, so it is possible to upgrade the clients first before the brokers
> {quote}
> Specifically the "brokers 0.10.0 brokers" wording.
> [1] http://kafka.apache.org/documentation.html#upgrade_11_message_format





[jira] [Created] (KAFKA-6336) when using assign() with kafka consumer the KafkaConsumerGroup command doesn't show those consumers

2017-12-08 Thread Neerja Khattar (JIRA)
Neerja Khattar created KAFKA-6336:
-

 Summary: when using assign() with kafka consumer the 
KafkaConsumerGroup command doesn't show those consumers
 Key: KAFKA-6336
 URL: https://issues.apache.org/jira/browse/KAFKA-6336
 Project: Kafka
  Issue Type: Bug
Reporter: Neerja Khattar


The issue is that when consumers use assign() rather than subscribe(), the 
ConsumerGroup command is not able to get the lag for their committed offsets. 
It doesn't even list those groups.

The JMX tool also doesn't show the lag properly.





[jira] [Created] (KAFKA-6335) SimpleAclAuthorizerTest#testHighConcurrencyModificationOfResourceAcls fails intermittently

2017-12-08 Thread Ted Yu (JIRA)
Ted Yu created KAFKA-6335:
-

 Summary: 
SimpleAclAuthorizerTest#testHighConcurrencyModificationOfResourceAcls fails 
intermittently
 Key: KAFKA-6335
 URL: https://issues.apache.org/jira/browse/KAFKA-6335
 Project: Kafka
  Issue Type: Test
Reporter: Ted Yu
Priority: Minor


From 
https://builds.apache.org/job/kafka-pr-jdk9-scala2.12/3045/testReport/junit/kafka.security.auth/SimpleAclAuthorizerTest/testHighConcurrencyModificationOfResourceAcls/ :
{code}
java.lang.AssertionError: expected acls Set(User:36 has Allow permission for 
operations: Read from hosts: *, User:7 has Allow permission for operations: 
Read from hosts: *, User:21 has Allow permission for operations: Read from 
hosts: *, User:39 has Allow permission for operations: Read from hosts: *, 
User:43 has Allow permission for operations: Read from hosts: *, User:3 has 
Allow permission for operations: Read from hosts: *, User:35 has Allow 
permission for operations: Read from hosts: *, User:15 has Allow permission for 
operations: Read from hosts: *, User:16 has Allow permission for operations: 
Read from hosts: *, User:22 has Allow permission for operations: Read from 
hosts: *, User:26 has Allow permission for operations: Read from hosts: *, 
User:11 has Allow permission for operations: Read from hosts: *, User:38 has 
Allow permission for operations: Read from hosts: *, User:8 has Allow 
permission for operations: Read from hosts: *, User:28 has Allow permission for 
operations: Read from hosts: *, User:32 has Allow permission for operations: 
Read from hosts: *, User:25 has Allow permission for operations: Read from 
hosts: *, User:41 has Allow permission for operations: Read from hosts: *, 
User:44 has Allow permission for operations: Read from hosts: *, User:48 has 
Allow permission for operations: Read from hosts: *, User:2 has Allow 
permission for operations: Read from hosts: *, User:9 has Allow permission for 
operations: Read from hosts: *, User:14 has Allow permission for operations: 
Read from hosts: *, User:46 has Allow permission for operations: Read from 
hosts: *, User:13 has Allow permission for operations: Read from hosts: *, 
User:5 has Allow permission for operations: Read from hosts: *, User:29 has 
Allow permission for operations: Read from hosts: *, User:45 has Allow 
permission for operations: Read from hosts: *, User:6 has Allow permission for 
operations: Read from hosts: *, User:37 has Allow permission for operations: 
Read from hosts: *, User:23 has Allow permission for operations: Read from 
hosts: *, User:19 has Allow permission for operations: Read from hosts: *, 
User:24 has Allow permission for operations: Read from hosts: *, User:17 has 
Allow permission for operations: Read from hosts: *, User:34 has Allow 
permission for operations: Read from hosts: *, User:12 has Allow permission for 
operations: Read from hosts: *, User:42 has Allow permission for operations: 
Read from hosts: *, User:4 has Allow permission for operations: Read from 
hosts: *, User:47 has Allow permission for operations: Read from hosts: *, 
User:18 has Allow permission for operations: Read from hosts: *, User:31 has 
Allow permission for operations: Read from hosts: *, User:49 has Allow 
permission for operations: Read from hosts: *, User:33 has Allow permission for 
operations: Read from hosts: *, User:1 has Allow permission for operations: 
Read from hosts: *, User:27 has Allow permission for operations: Read from 
hosts: *) but got Set(User:36 has Allow permission for operations: Read from 
hosts: *, User:7 has Allow permission for operations: Read from hosts: *, 
User:21 has Allow permission for operations: Read from hosts: *, User:39 has 
Allow permission for operations: Read from hosts: *, User:43 has Allow 
permission for operations: Read from hosts: *, User:3 has Allow permission for 
operations: Read from hosts: *, User:35 has Allow permission for operations: 
Read from hosts: *, User:15 has Allow permission for operations: Read from 
hosts: *, User:16 has Allow permission for operations: Read from hosts: *, 
User:22 has Allow permission for operations: Read from hosts: *, User:26 has 
Allow permission for operations: Read from hosts: *, User:11 has Allow 
permission for operations: Read from hosts: *, User:38 has Allow permission for 
operations: Read from hosts: *, User:8 has Allow permission for operations: 
Read from hosts: *, User:28 has Allow permission for operations: Read from 
hosts: *, User:32 has Allow permission for operations: Read from hosts: *, 
User:25 has Allow permission for operations: Read from hosts: *, User:41 has 
Allow permission for operations: Read from hosts: *, User:44 has Allow 
permission for operations: Read from hosts: *, User:48 has Allow permission for 
operations: Read from hosts: *, User:2 has Allow permission for operations: 
Read from hosts: *, User:9 has Allow permission for ...
{code}

[jira] [Commented] (KAFKA-6323) punctuate with WALL_CLOCK_TIME triggered immediately

2017-12-08 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284417#comment-16284417
 ] 

Matthias J. Sax commented on KAFKA-6323:


We discussed a "mixed mode" during the KIP-138 discussion 
(https://cwiki.apache.org/confluence/display/KAFKA/KIP-138%3A+Change+punctuate+semantics)
 and decided to drop the idea. Users can build this themselves (note, you can 
register as many independent punctuation schedules for a single processor as 
you want, and can also mix wall-clock and stream-time).

I also second Guozhang's proposal!

> punctuate with WALL_CLOCK_TIME triggered immediately
> 
>
> Key: KAFKA-6323
> URL: https://issues.apache.org/jira/browse/KAFKA-6323
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 1.0.0
>Reporter: Frederic Arno
>Assignee: Frederic Arno
> Fix For: 1.1.0, 1.0.1
>
>
> While working on a custom Processor that schedules a punctuation using 
> WALL_CLOCK_TIME, I've noticed that whatever punctuation interval I 
> set, a call to my Punctuator is always triggered immediately.
> Having a quick look at kafka-streams' code, I could find that all 
> PunctuationSchedule's timestamps are matched against the current time in 
> order to decide whether or not to trigger the punctuator 
> (org.apache.kafka.streams.processor.internals.PunctuationQueue#mayPunctuate). 
> However, I've only seen code that initializes PunctuationSchedule's timestamp 
> to 0, which I guess is what is causing an immediate punctuation.
> At least when using WALL_CLOCK_TIME, shouldn't the PunctuationSchedule's 
> timestamp be initialized to current time + interval?
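The fix suggested in the last line reduces to a pure computation: initialize the schedule's next-fire deadline to now + interval instead of 0. The helper below is hypothetical, illustrating the behaviour rather than the actual PunctuationQueue code:

```java
// Hypothetical helper illustrating the report: a schedule whose timestamp is
// initialized to 0 fires on the very first check, because any current time
// exceeds 0; the suggested fix sets the first deadline one interval ahead.
public class PunctuationInit {
    // Broken behaviour described in the ticket: deadline starts at 0.
    static boolean firesImmediately(long nowMs) {
        long timestamp = 0L;
        return nowMs >= timestamp;
    }

    // Suggested behaviour: first deadline is current time + interval.
    static long firstDeadline(long nowMs, long intervalMs) {
        return nowMs + intervalMs;
    }

    public static void main(String[] args) {
        System.out.println(firesImmediately(1_000L));       // prints true (the bug)
        System.out.println(firstDeadline(1_000L, 60_000L)); // prints 61000
    }
}
```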





[jira] [Commented] (KAFKA-6323) punctuate with WALL_CLOCK_TIME triggered immediately

2017-12-08 Thread Stephane Maarek (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284366#comment-16284366
 ] 

Stephane Maarek commented on KAFKA-6323:


Fully agree [~guozhang]. I also fully agree on punctuating only once (even if 
T2 is 5 intervals away); I have observed punctuate being called way too many 
times when the data makes a big jump.

Finally, is there any interest or use case in using both a wall-clock and an 
event-driven punctuate? That one might require a KIP.

> punctuate with WALL_CLOCK_TIME triggered immediately
> 
>
> Key: KAFKA-6323
> URL: https://issues.apache.org/jira/browse/KAFKA-6323
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 1.0.0
>Reporter: Frederic Arno
>Assignee: Frederic Arno
> Fix For: 1.1.0, 1.0.1
>
>
> While working on a custom Processor that schedules a punctuation using 
> WALL_CLOCK_TIME, I've noticed that whatever punctuation interval I 
> set, a call to my Punctuator is always triggered immediately.
> Having a quick look at kafka-streams' code, I could find that all 
> PunctuationSchedule's timestamps are matched against the current time in 
> order to decide whether or not to trigger the punctuator 
> (org.apache.kafka.streams.processor.internals.PunctuationQueue#mayPunctuate). 
> However, I've only seen code that initializes PunctuationSchedule's timestamp 
> to 0, which I guess is what is causing an immediate punctuation.
> At least when using WALL_CLOCK_TIME, shouldn't the PunctuationSchedule's 
> timestamp be initialized to current time + interval?





[jira] [Created] (KAFKA-6334) Minor documentation typo

2017-12-08 Thread Andrew Olson (JIRA)
Andrew Olson created KAFKA-6334:
---

 Summary: Minor documentation typo
 Key: KAFKA-6334
 URL: https://issues.apache.org/jira/browse/KAFKA-6334
 Project: Kafka
  Issue Type: Bug
  Components: documentation
Affects Versions: 1.0.0
Reporter: Andrew Olson
Priority: Trivial


At [1]:

{quote}
0.11.0 consumers support backwards compatibility with brokers 0.10.0 brokers 
and upward, so it is possible to upgrade the clients first before the brokers
{quote}

Specifically the "brokers 0.10.0 brokers" wording.

[1] http://kafka.apache.org/documentation.html#upgrade_11_message_format





[jira] [Created] (KAFKA-6333) java.awt.headless should not be on commandline

2017-12-08 Thread Fabrice Bacchella (JIRA)
Fabrice Bacchella created KAFKA-6333:


 Summary: java.awt.headless should not be on commandline
 Key: KAFKA-6333
 URL: https://issues.apache.org/jira/browse/KAFKA-6333
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 1.0.0
Reporter: Fabrice Bacchella
Priority: Trivial


The option -Djava.awt.headless=true is defined in KAFKA_JVM_PERFORMANCE_OPTS.

But it should not be present on the command line at all. It is only useful for 
applications that run in both a headless and a traditional desktop environment. 
Kafka is a server, so the property should be set in the main class. This helps 
reduce clutter on the command line.

See http://www.oracle.com/technetwork/articles/javase/headless-136834.html for 
more details.
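The suggestion above can be sketched as follows; the class name is hypothetical, standing in for the server's entry point:

```java
// Hypothetical sketch: set java.awt.headless programmatically, early in the
// server's main class, instead of passing -Djava.awt.headless=true on the
// command line. The static initializer runs before main, so the property is
// in place before any AWT class could be initialized.
public class ServerMain {
    static {
        // System properties set here take effect for the whole JVM.
        System.setProperty("java.awt.headless", "true");
    }

    public static void main(String[] args) {
        System.out.println(java.awt.GraphicsEnvironment.isHeadless()); // prints true
    }
}
```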





[jira] [Commented] (KAFKA-6307) mBeanName should be removed before returning from JmxReporter#removeAttribute()

2017-12-08 Thread siva santhalingam (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284135#comment-16284135
 ] 

siva santhalingam commented on KAFKA-6307:
--

[~tedyu] Please go ahead and assign it to yourself.

> mBeanName should be removed before returning from 
> JmxReporter#removeAttribute()
> ---
>
> Key: KAFKA-6307
> URL: https://issues.apache.org/jira/browse/KAFKA-6307
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: siva santhalingam
>
> JmxReporter$KafkaMbean showed up near the top in the first histo output from 
> KAFKA-6199.
> In JmxReporter#removeAttribute() :
> {code}
> KafkaMbean mbean = this.mbeans.get(mBeanName);
> if (mbean != null)
> mbean.removeAttribute(metricName.name());
> return mbean;
> {code}
> mbeans.remove(mBeanName) should be called before returning.
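The suggested change can be illustrated with a simplified stand-in for the JmxReporter internals (a sketch, not the actual Kafka code): Map.remove both looks up and removes the mbean, so it replaces the plain get before returning.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in illustrating the ticket's suggestion: remove the mbean
// from the registry map before returning it, rather than only removing the
// attribute. Map.remove conveniently returns the prior value (or null).
public class JmxReporterSketch {
    static class KafkaMbean {
        void removeAttribute(String name) { /* drop the metric attribute */ }
    }

    private final Map<String, KafkaMbean> mbeans = new HashMap<>();

    void register(String mBeanName, KafkaMbean mbean) {
        mbeans.put(mBeanName, mbean);
    }

    int count() {
        return mbeans.size();
    }

    KafkaMbean removeAttribute(String mBeanName, String attributeName) {
        KafkaMbean mbean = this.mbeans.remove(mBeanName); // removed, not just read
        if (mbean != null)
            mbean.removeAttribute(attributeName);
        return mbean;
    }
}
```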





[jira] [Commented] (KAFKA-6318) StreamsResetter should return non-zero return code on error

2017-12-08 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284083#comment-16284083
 ] 

Matthias J. Sax commented on KAFKA-6318:


Of course you can assign tickets to yourself if it's unassigned.

> StreamsResetter should return non-zero return code on error
> ---
>
> Key: KAFKA-6318
> URL: https://issues.apache.org/jira/browse/KAFKA-6318
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, tools
>Affects Versions: 1.0.0
>Reporter: Matthias J. Sax
>Assignee: siva santhalingam
>
> If users specify a non-existent topic as an input parameter, 
> {{StreamsResetter}} only prints an error message that the topic was not 
> found, but the return code is still zero. We should return a non-zero return 
> code for this case.
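The requested behaviour can be sketched as a small validation step that derives a process exit code from the requested input topics and the topics that actually exist; this is a hypothetical helper, not the actual StreamsResetter code:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the requested behaviour: a non-zero exit code when
// any requested input topic does not exist, instead of exiting with 0.
public class ResetterExitCode {
    static final int EXIT_SUCCESS = 0;
    static final int EXIT_ERROR = 1;

    static int validateTopics(List<String> requested, Set<String> existing) {
        for (String topic : requested) {
            if (!existing.contains(topic)) {
                System.err.println("Error: input topic " + topic + " not found");
                return EXIT_ERROR; // previously the tool would still exit with 0
            }
        }
        return EXIT_SUCCESS;
    }

    public static void main(String[] args) {
        Set<String> existing = new HashSet<>(Arrays.asList("orders", "payments"));
        int code = validateTopics(Arrays.asList("orders", "missing-topic"), existing);
        // A real tool would finish with System.exit(code); here we just print it.
        System.out.println(code); // prints 1: "missing-topic" does not exist
    }
}
```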





[jira] [Assigned] (KAFKA-6318) StreamsResetter should return non-zero return code on error

2017-12-08 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax reassigned KAFKA-6318:
--

Assignee: siva santhalingam

> StreamsResetter should return non-zero return code on error
> ---
>
> Key: KAFKA-6318
> URL: https://issues.apache.org/jira/browse/KAFKA-6318
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, tools
>Affects Versions: 1.0.0
>Reporter: Matthias J. Sax
>Assignee: siva santhalingam
>
> If users specify a non-existent topic as an input parameter, 
> {{StreamsResetter}} only prints an error message that the topic was not 
> found, but the return code is still zero. We should return a non-zero return 
> code for this case.





[jira] [Commented] (KAFKA-6307) mBeanName should be removed before returning from JmxReporter#removeAttribute()

2017-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284053#comment-16284053
 ] 

ASF GitHub Bot commented on KAFKA-6307:
---

GitHub user tedyu opened a pull request:

https://github.com/apache/kafka/pull/4307

KAFKA-6307 mBeanName should be removed before returning from 
JmxReporter#removeAttribute()


### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation 
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tedyu/kafka trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4307.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4307


commit 82d3f60eaf0548b478c7d873c5fbf4b930a11473
Author: tedyu 
Date:   2017-12-08T19:08:30Z

KAFKA-6307 mBeanName should be removed before returning from 
JmxReporter#removeAttribute()




> mBeanName should be removed before returning from 
> JmxReporter#removeAttribute()
> ---
>
> Key: KAFKA-6307
> URL: https://issues.apache.org/jira/browse/KAFKA-6307
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: siva santhalingam
>
> JmxReporter$KafkaMbean showed up near the top in the first histo output from 
> KAFKA-6199.
> In JmxReporter#removeAttribute() :
> {code}
> KafkaMbean mbean = this.mbeans.get(mBeanName);
> if (mbean != null)
> mbean.removeAttribute(metricName.name());
> return mbean;
> {code}
> mbeans.remove(mBeanName) should be called before returning.





[jira] [Commented] (KAFKA-6331) Transient failure in kafka.api.AdminClientIntegrationTest.testLogStartOffsetCheckpointkafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs

2017-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283995#comment-16283995
 ] 

ASF GitHub Bot commented on KAFKA-6331:
---

GitHub user lindong28 opened a pull request:

https://github.com/apache/kafka/pull/4306

KAFKA-6331; Fix transient failure in 
AdminClientIntegrationTest.testAlterReplicaLogDirs


### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation 
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lindong28/kafka KAFKA-6331

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4306.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4306


commit f5c364e774a7a3b75fc49f8f5eb6bb707d1e799d
Author: Dong Lin 
Date:   2017-12-08T18:34:16Z

KAFKA-6331; Fix transient failure in 
AdminClientIntegrationTest.testAlterReplicaLogDirs




> Transient failure in 
> kafka.api.AdminClientIntegrationTest.testLogStartOffsetCheckpointkafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs
> --
>
> Key: KAFKA-6331
> URL: https://issues.apache.org/jira/browse/KAFKA-6331
> Project: Kafka
>  Issue Type: Bug
>  Components: unit tests
>Reporter: Guozhang Wang
>
> Saw this error once on Jenkins: 
> https://builds.apache.org/job/kafka-pr-jdk9-scala2.12/3025/testReport/junit/kafka.api/AdminClientIntegrationTest/testAlterReplicaLogDirs/
> {code}
> Stacktrace
> java.lang.AssertionError: timed out waiting for message produce
>   at kafka.utils.TestUtils$.fail(TestUtils.scala:347)
>   at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:861)
>   at 
> kafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs(AdminClientIntegrationTest.scala:357)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at java.base/java.lang.Thread.run(Thread.java:844)
> Standard Output
> [2017-12-07 19:22:56,297] ERROR ZKShutdownHandler is not registered, so 
> ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
> changes (org.apache.zookeeper.server.ZooKeeperServer:472)
> [2017-12-07 19:22:59,447] ERROR ZKShutdownHandler is not registered, so 
> ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
> changes (org.apache.zookeeper.server.ZooKeeperServer:472)
> [2017-12-07 19:22:59,453] ERROR ZKShutdownHandler is not registered, so 
> ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
> changes (org.apache.zookeeper.server.ZooKeeperServer:472)
> [2017-12-07 19:23:01,335] ERROR Error while creating ephemeral at 
> /controller, node already exists and owner '99134641238966279' does not match 
> current session '99134641238966277' 
> (kafka.zk.KafkaZkClient$CheckedEphemeral:71)
> [2017-12-07 19:23:04,695] ERROR ZKShutdownHandler is not registered, so 
> ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
> changes (org.apache.zookeeper.server.ZooKeeperServer:472)

[jira] [Commented] (KAFKA-6318) StreamsResetter should return non-zero return code on error

2017-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283954#comment-16283954
 ] 

ASF GitHub Bot commented on KAFKA-6318:
---

GitHub user shivsantham opened a pull request:

https://github.com/apache/kafka/pull/4305

KAFKA-6318: StreamsResetter should return non-zero return code on error

If users specify a non-existing topic as an input parameter, StreamsResetter 
only prints out an error message that the topic was not found, but the return 
code is still zero. We should return a non-zero return code for this case.
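The intended behavior can be sketched as plain exit-code handling: validate the requested input topics and let a missing topic change the process exit status, not just the log output. The names below (`validateInputTopics`, `EXIT_ERROR`) are illustrative, not the actual StreamsResetter API.

```java
import java.util.Set;

// Hypothetical sketch of the exit-code handling KAFKA-6318 asks for:
// a missing input topic should affect the tool's return code instead
// of only producing an error message.
public class ResetterExitCodeSketch {

    static final int EXIT_SUCCESS = 0;
    static final int EXIT_ERROR = 1;

    // Returns the exit code the tool would pass to System.exit().
    static int validateInputTopics(Set<String> existingTopics, Set<String> requestedTopics) {
        int exitCode = EXIT_SUCCESS;
        for (String topic : requestedTopics) {
            if (!existingTopics.contains(topic)) {
                System.err.println("Input topic " + topic + " not found.");
                exitCode = EXIT_ERROR; // report the error AND make the run fail
            }
        }
        return exitCode;
    }

    public static void main(String[] args) {
        Set<String> existing = Set.of("orders", "payments");
        System.out.println(validateInputTopics(existing, Set.of("orders")));        // 0
        System.out.println(validateInputTopics(existing, Set.of("no-such-topic"))); // 1
    }
}
```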



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shivsantham/kafka KAFKA-6318

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4305.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4305


commit a9696d03bd6f9e03071b3d76b3ad51a411766557
Author: siva santhalingam 
Date:   2017-12-08T18:05:55Z

KAFKA-6318: StreamsResetter should return non-zero return code on error




> StreamsResetter should return non-zero return code on error
> ---
>
> Key: KAFKA-6318
> URL: https://issues.apache.org/jira/browse/KAFKA-6318
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, tools
>Affects Versions: 1.0.0
>Reporter: Matthias J. Sax
>
> If users specify a non-existing topic as an input parameter, 
> {{StreamsResetter}} only prints out an error message that the topic was not 
> found, but the return code is still zero. We should return a non-zero return 
> code for this case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6307) mBeanName should be removed before returning from JmxReporter#removeAttribute()

2017-12-08 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283925#comment-16283925
 ] 

Ted Yu commented on KAFKA-6307:
---

Siva:
I have some cycles now.
If you haven't started, can I continue working on this?

> mBeanName should be removed before returning from 
> JmxReporter#removeAttribute()
> ---
>
> Key: KAFKA-6307
> URL: https://issues.apache.org/jira/browse/KAFKA-6307
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: siva santhalingam
>
> JmxReporter$KafkaMbean showed up near the top in the first histo output from 
> KAFKA-6199.
> In JmxReporter#removeAttribute() :
> {code}
> KafkaMbean mbean = this.mbeans.get(mBeanName);
> if (mbean != null)
> mbean.removeAttribute(metricName.name());
> return mbean;
> {code}
> mbeans.remove(mBeanName) should be called before returning.
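The suggested fix can be sketched with plain stand-ins for the JmxReporter internals: remove the entry from the registry map before returning it, so stale beans do not accumulate. The classes below simulate the structure with a `HashMap`; they are not the actual JmxReporter code.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins for JmxReporter internals, illustrating the
// suggested fix: drop the mbean from the registry map before returning
// it, rather than only mutating and returning it.
public class RemoveAttributeSketch {

    static class KafkaMbean {
        final Map<String, Object> attributes = new HashMap<>();
        void removeAttribute(String name) { attributes.remove(name); }
    }

    final Map<String, KafkaMbean> mbeans = new HashMap<>();

    KafkaMbean removeAttribute(String mBeanName, String metricName) {
        KafkaMbean mbean = this.mbeans.get(mBeanName);
        if (mbean != null) {
            mbean.removeAttribute(metricName);
            this.mbeans.remove(mBeanName); // the missing call the ticket points out
        }
        return mbean;
    }
}
```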



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6323) punctuate with WALL_CLOCK_TIME triggered immediately

2017-12-08 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283901#comment-16283901
 ] 

Guozhang Wang commented on KAFKA-6323:
--

Here are my thoughts on punctuation semantics (on both KAFKA-6323 and 
KAFKA-6092):

*First trigger*: we should only punctuate the first time after the specified 
period has elapsed. And here is a slight difference with wall-clock time and 
stream time:

1. WALL_CLOCK_TIME: when the stream application starts at t0 (system wall-clock 
time), punctuate first-time at t0 + t_scheduled.
2. STREAM_TIME: when the stream application starts, we do not schedule the first 
punctuation until the stream time is known (i.e. we have received at least one 
record from each input topic); say it is T0, then punctuate first-time at T0 + 
T_scheduled.

*Interval*: again I think there is a slight difference with wall-clock time and 
stream time:

1. WALL_CLOCK_TIME: if the stream application last punctuated at t1, punctuate 
next at t1 + t_scheduled, even if no data arrived during that period.
2. STREAM_TIME: this is data driven, and hence: if the stream application last 
punctuated at T1, and stream time is then updated and advanced to T2, where 
(T2 - T1) > t_scheduled, punctuate once at T2, even if (T2 - T1) >= 
t_scheduled * 2.
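The first-trigger and interval rules above reduce to simple arithmetic. The sketch below is a simplified model of the proposed semantics, not the actual Kafka Streams scheduler code.

```java
// Simplified model of the proposed punctuation semantics (KAFKA-6323 /
// KAFKA-6092); not the actual Kafka Streams scheduler.
public class PunctuationSemanticsSketch {

    // WALL_CLOCK_TIME: first trigger only after one full period has elapsed.
    static long firstWallClockPunctuation(long startTime, long interval) {
        return startTime + interval;
    }

    // STREAM_TIME: data driven; fire at most once per advance, even when the
    // stream time jumped by more than one interval. (Uses >= as a simplified
    // reading of "the period has elapsed".)
    static boolean shouldPunctuate(long lastPunctuationTime, long newStreamTime, long interval) {
        return newStreamTime - lastPunctuationTime >= interval;
    }

    public static void main(String[] args) {
        long interval = 100;
        System.out.println(firstWallClockPunctuation(1000, interval)); // 1100
        // Stream time advanced by more than two intervals: still a single trigger.
        System.out.println(shouldPunctuate(1000, 1250, interval)); // true
    }
}
```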

WDYT? cc [~stephane.maa...@gmail.com] [~mih...@wp.pl]

> punctuate with WALL_CLOCK_TIME triggered immediately
> 
>
> Key: KAFKA-6323
> URL: https://issues.apache.org/jira/browse/KAFKA-6323
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 1.0.0
>Reporter: Frederic Arno
>Assignee: Frederic Arno
> Fix For: 1.1.0, 1.0.1
>
>
> While working on a custom Processor that schedules a punctuation using 
> WALL_CLOCK_TIME, I noticed that whatever punctuation interval I set, a call to 
> my Punctuator is always triggered immediately.
> Having a quick look at kafka-streams' code, I could find that all 
> PunctuationSchedule's timestamps are matched against the current time in 
> order to decide whether or not to trigger the punctuator 
> (org.apache.kafka.streams.processor.internals.PunctuationQueue#mayPunctuate). 
> However, I've only seen code that initializes PunctuationSchedule's timestamp 
> to 0, which I guess is what is causing an immediate punctuation.
> At least when using WALL_CLOCK_TIME, shouldn't the PunctuationSchedule's 
> timestamp be initialized to current time + interval?
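The fix the reporter proposes can be sketched as follows: initialize the schedule's next-fire timestamp to (current time + interval) instead of 0, so the first wall-clock punctuation does not fire immediately. This is a simplified model, not the actual `PunctuationQueue` implementation.

```java
// Sketch of the proposed fix for the immediate first trigger: seed the
// schedule's timestamp with (now + interval) rather than 0. Simplified
// model; not the actual Kafka Streams PunctuationQueue code.
public class ScheduleInitSketch {

    static class PunctuationSchedule {
        long timestamp; // next time the punctuator may fire

        PunctuationSchedule(long now, long interval) {
            this.timestamp = now + interval; // proposed: not 0
        }

        boolean mayPunctuate(long now) {
            return now >= timestamp;
        }
    }

    public static void main(String[] args) {
        PunctuationSchedule s = new PunctuationSchedule(1_000, 500);
        System.out.println(s.mayPunctuate(1_000)); // false: no immediate trigger
        System.out.println(s.mayPunctuate(1_500)); // true: one interval elapsed
    }
}
```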



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6323) punctuate with WALL_CLOCK_TIME triggered immediately

2017-12-08 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283863#comment-16283863
 ] 

Guozhang Wang commented on KAFKA-6323:
--

[~frederica] I have added you to the contributor list. You can assign tickets 
to yourself now.

> punctuate with WALL_CLOCK_TIME triggered immediately
> 
>
> Key: KAFKA-6323
> URL: https://issues.apache.org/jira/browse/KAFKA-6323
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 1.0.0
>Reporter: Frederic Arno
>Assignee: Frederic Arno
> Fix For: 1.1.0, 1.0.1
>
>
> While working on a custom Processor that schedules a punctuation using 
> WALL_CLOCK_TIME, I noticed that whatever punctuation interval I set, a call to 
> my Punctuator is always triggered immediately.
> Having a quick look at kafka-streams' code, I could find that all 
> PunctuationSchedule's timestamps are matched against the current time in 
> order to decide whether or not to trigger the punctuator 
> (org.apache.kafka.streams.processor.internals.PunctuationQueue#mayPunctuate). 
> However, I've only seen code that initializes PunctuationSchedule's timestamp 
> to 0, which I guess is what is causing an immediate punctuation.
> At least when using WALL_CLOCK_TIME, shouldn't the PunctuationSchedule's 
> timestamp be initialized to current time + interval?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (KAFKA-6323) punctuate with WALL_CLOCK_TIME triggered immediately

2017-12-08 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang reassigned KAFKA-6323:


Assignee: Frederic Arno

> punctuate with WALL_CLOCK_TIME triggered immediately
> 
>
> Key: KAFKA-6323
> URL: https://issues.apache.org/jira/browse/KAFKA-6323
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 1.0.0
>Reporter: Frederic Arno
>Assignee: Frederic Arno
> Fix For: 1.1.0, 1.0.1
>
>
> While working on a custom Processor that schedules a punctuation using 
> WALL_CLOCK_TIME, I noticed that whatever punctuation interval I set, a call to 
> my Punctuator is always triggered immediately.
> Having a quick look at kafka-streams' code, I could find that all 
> PunctuationSchedule's timestamps are matched against the current time in 
> order to decide whether or not to trigger the punctuator 
> (org.apache.kafka.streams.processor.internals.PunctuationQueue#mayPunctuate). 
> However, I've only seen code that initializes PunctuationSchedule's timestamp 
> to 0, which I guess is what is causing an immediate punctuation.
> At least when using WALL_CLOCK_TIME, shouldn't the PunctuationSchedule's 
> timestamp be initialized to current time + interval?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-5002) Stream doesn't seem to consider partitions for processing which are being consumed

2017-12-08 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-5002.
--
Resolution: Not A Problem

> Stream doesn't seem to consider partitions for processing which are being 
> consumed
> -
>
> Key: KAFKA-5002
> URL: https://issues.apache.org/jira/browse/KAFKA-5002
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
> Environment: Windows 8.1
>Reporter: Mustak
>  Labels: patch
>
> Kafka Streams doesn't seem to consider a particular partition for processing 
> if that partition is being consumed by some consumer. For example, I have two 
> topics t1 and t2, each with two partitions p1 and p2, and a stream process 
> running that consumes data from these topics and produces output to topic t3, 
> which has two partitions. If I run this kind of topology it works, but if I 
> start a consumer which consumes data from topic t1 and partition p1, then the 
> stream logic doesn't consider p1 for processing and the stream doesn't provide 
> any output related to that partition. I think the stream logic should consider 
> partitions which are being consumed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5868) Kafka Consumer Rebalancing takes too long

2017-12-08 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283857#comment-16283857
 ] 

Guozhang Wang commented on KAFKA-5868:
--

[~nandishkotadia] Have you tried the newer version {{1.0.0}} to see if this 
issue goes away? Note that you can code your app with the 1.0.0 client, which 
can talk to older-versioned brokers.

> Kafka Consumer Rebalancing takes too long
> -
>
> Key: KAFKA-5868
> URL: https://issues.apache.org/jira/browse/KAFKA-5868
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0, 0.10.2.1, 0.11.0.0
>Reporter: Nandish Kotadia
>
> I have a Kafka Streams Application which takes data from few topics and joins 
> the data and puts it in another topic.
> *Kafka Configuration: *
> * 5 kafka brokers
> * Kafka Topics - 15 partitions and 3 replication factor.
> A few million records are consumed/produced every hour. Whenever I take any 
> Kafka broker down, the group goes into rebalancing, which takes approx. 30 
> minutes or sometimes even more. It also kills many of my Kafka Streams 
> processes.
> *Note: My Kafka Streams processes are running on the same machine as of Kafka 
> Broker.*
> Anyone has any idea how to solve rebalancing issue in kafka consumer? Also, 
> many times it throws exception while rebalancing.
> This is stopping us from going live in production environment with this 
> setup. Any help would be appreciated.
> _Caused by: org.apache.kafka.clients.consumer.CommitFailedException: ?
> Commit cannot be completed since the group has already rebalanced and 
> assigned the partitions to another member. This means that the time between 
> subsequent calls to poll() was longer than the configured 
> max.poll.interval.ms, which typically implies that the poll loop is spending 
> too much time message processing. You can address this either by increasing 
> the session timeout or by reducing the maximum size of batches returned in 
> poll() with max.poll.records.
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.sendOffsetCommitRequest(ConsumerCoordinator.java:725)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:604)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1173)
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commitOffsets(StreamTask.java:307)
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.access$000(StreamTask.java:49)
> at 
> org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:268)
> at 
> org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:187)
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commitImpl(StreamTask.java:259)
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.suspend(StreamTask.java:362)
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.suspend(StreamTask.java:346)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$3.apply(StreamThread.java:1118)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.performOnStreamTasks(StreamThread.java:1448)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.suspendTasksAndState(StreamThread.java:1110)_
> *Kafka Streams Config: *
> * 
> bootstrap.servers=kafka-1:9092,kafka-2:9092,kafka-3:9092,kafka-4:9092,kafka-5:9092
> * max.poll.records = 100
> * request.timeout.ms=4
> ConsumerConfig it internally creates is:
> auto.commit.interval.ms = 5000
> auto.offset.reset = earliest
> bootstrap.servers = [kafka-1:9092, kafka-2:9092, kafka-3:9092, 
> kafka-4:9092, kafka-5:9092]
> check.crcs = true
> client.id = conversion-live-StreamThread-1-restore-consumer
> connections.max.idle.ms = 54
> enable.auto.commit = false
> exclude.internal.topics = true
> fetch.max.bytes = 52428800
> fetch.max.wait.ms = 500
> fetch.min.bytes = 1
> group.id = 
> heartbeat.interval.ms = 3000
> interceptor.classes = null
> internal.leave.group.on.close = false
> isolation.level = read_uncommitted
> key.deserializer = class 
> org.apache.kafka.common.serialization.ByteArrayDeserializer
> max.partition.fetch.bytes = 1048576
> max.poll.interval.ms = 2147483647
> max.poll.records = 100
> metadata.max.age.ms = 30
> metric.reporters = []
> metrics.num.samples = 2
> metrics.recording.level = INFO
> metrics.sample.window.ms = 3
> partition.assignment.strategy = [class 
> org.apache.kafka.clients.consumer.RangeAssignor]
> receive.buffer.bytes = 65536
> reconnect.backoff.max.ms = 1000
> reconnect.backoff.ms = 50
> 
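The remedy the quoted exception text suggests, doing less work per poll() or allowing more time between polls, can be sketched as consumer settings. The values below are illustrative examples, not recommendations.

```java
import java.util.Properties;

// Illustrative consumer settings addressing the CommitFailedException
// above: either reduce the batch handed back by poll() or widen the
// allowed gap between successive poll() calls. Example values only.
public class RebalanceTuningSketch {
    static Properties tunedConsumerConfig() {
        Properties props = new Properties();
        // Fewer records per poll() => less processing time per loop iteration.
        props.setProperty("max.poll.records", "50");
        // More headroom between successive poll() calls before eviction.
        props.setProperty("max.poll.interval.ms", "600000");
        // Session timeout governs heartbeat-based liveness, a separate knob.
        props.setProperty("session.timeout.ms", "30000");
        return props;
    }
}
```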

[jira] [Commented] (KAFKA-6237) stream stopped working after exception: Cannot execute transactional method because we are in an error state

2017-12-08 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283845#comment-16283845
 ] 

Guozhang Wang commented on KAFKA-6237:
--

I don't think this is a broker issue; note that the {{TransactionManager}} code 
is on the producer client side.

> stream stopped working after exception: Cannot execute transactional method 
> because we are in an error state
> 
>
> Key: KAFKA-6237
> URL: https://issues.apache.org/jira/browse/KAFKA-6237
> Project: Kafka
>  Issue Type: Bug
>  Components: core, streams
>Reporter: DHRUV BANSAL
>Priority: Critical
>  Labels: exactly-once
> Attachments: nohup.out
>
>
> 2017-11-19 07:52:44,673 
> [project_logs_stream-a30ea242-3c9f-46a9-a01c-51903bd40ca5-StreamThread-1] 
> ERROR: org.apache.kafka.streams.processor.internals.StreamThread - 
> stream-thread 
> [orion_logs_stream-a30ea242-3c9f-46a9-a01c-51903bd40ca5-StreamThread-1] 
> Failed while closing StreamTask 0_1:
> org.apache.kafka.common.KafkaException: Cannot execute transactional method 
> because we are in an error state
>   at 
> org.apache.kafka.clients.producer.internals.TransactionManager.maybeFailWithError(TransactionManager.java:524)
>   at 
> org.apache.kafka.clients.producer.internals.TransactionManager.beginAbort(TransactionManager.java:198)
>   at 
> org.apache.kafka.clients.producer.KafkaProducer.abortTransaction(KafkaProducer.java:598)
>   at 
> org.apache.kafka.streams.processor.internals.StreamTask.close(StreamTask.java:434)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.shutdownTasksAndState(StreamThread.java:1086)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.shutdown(StreamThread.java:1041)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:538)
> Caused by: org.apache.kafka.common.KafkaException: Unexpected error in 
> AddOffsetsToTxnResponse: The server experienced an unexpected error when 
> processing the request
>   at 
> org.apache.kafka.clients.producer.internals.TransactionManager$AddOffsetsToTxnHandler.handleResponse(TransactionManager.java:978)
>   at 
> org.apache.kafka.clients.producer.internals.TransactionManager$TxnRequestHandler.onComplete(TransactionManager.java:648)
>   at 
> org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:101)
>   at 
> org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:454)
>   at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:446)
>   at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:206)
>   at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:162)
>   at java.lang.Thread.run(Thread.java:745)
> Also when I see the state of the corresponding consumer group it is saying:
> +Warning: Consumer group  is rebalancing.+



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-5702) Refactor StreamThread to separate concerns and enable better testability

2017-12-08 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-5702.
--
Resolution: Fixed

> Refactor StreamThread to separate concerns and enable better testability
> 
>
> Key: KAFKA-5702
> URL: https://issues.apache.org/jira/browse/KAFKA-5702
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Damian Guy
>Assignee: Damian Guy
>
> {{StreamThread}} does a lot of stuff, i.e., managing and creating tasks, 
> getting data from consumers, updating standby tasks, punctuating, rebalancing 
> etc. With the current design it is extremely hard to reason about and is 
> quite tightly coupled. 
> We need to start to tease out some of the separate concerns from 
> StreamThread, ie, TaskManager, RebalanceListener etc. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-4706) Unify StreamsKafkaClient instances

2017-12-08 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283819#comment-16283819
 ] 

Guozhang Wang commented on KAFKA-4706:
--

[~sharad.develop] StreamsKafkaClient has been replaced with the AdminClient. 
I'm closing this ticket now.

If you'd like to contribute to other tickets, please take a look at the other 
newbie-labeled tasks.

> Unify StreamsKafkaClient instances
> --
>
> Key: KAFKA-4706
> URL: https://issues.apache.org/jira/browse/KAFKA-4706
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Matthias J. Sax
>Assignee: Sharad
>Priority: Minor
>  Labels: beginner, easyfix, newbie
>
> Kafka Streams currently used two instances of {{StreamsKafkaClient}} (one in 
> {{KafkaStreams}} and one in {{InternalTopicManager}}).
> We want to unify both such that only a single instance is used.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6331) Transient failure in kafka.api.AdminClientIntegrationTest.testLogStartOffsetCheckpointkafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs

2017-12-08 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283786#comment-16283786
 ] 

Ismael Juma commented on KAFKA-6331:


cc [~lindong]

> Transient failure in 
> kafka.api.AdminClientIntegrationTest.testLogStartOffsetCheckpointkafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs
> --
>
> Key: KAFKA-6331
> URL: https://issues.apache.org/jira/browse/KAFKA-6331
> Project: Kafka
>  Issue Type: Bug
>  Components: unit tests
>Reporter: Guozhang Wang
>
> Saw this error once on Jenkins: 
> https://builds.apache.org/job/kafka-pr-jdk9-scala2.12/3025/testReport/junit/kafka.api/AdminClientIntegrationTest/testAlterReplicaLogDirs/
> {code}
> Stacktrace
> java.lang.AssertionError: timed out waiting for message produce
>   at kafka.utils.TestUtils$.fail(TestUtils.scala:347)
>   at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:861)
>   at 
> kafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs(AdminClientIntegrationTest.scala:357)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at java.base/java.lang.Thread.run(Thread.java:844)
> Standard Output
> [2017-12-07 19:22:56,297] ERROR ZKShutdownHandler is not registered, so 
> ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
> changes (org.apache.zookeeper.server.ZooKeeperServer:472)
> [2017-12-07 19:22:59,447] ERROR ZKShutdownHandler is not registered, so 
> ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
> changes (org.apache.zookeeper.server.ZooKeeperServer:472)
> [2017-12-07 19:22:59,453] ERROR ZKShutdownHandler is not registered, so 
> ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
> changes (org.apache.zookeeper.server.ZooKeeperServer:472)
> [2017-12-07 19:23:01,335] ERROR Error while creating ephemeral at 
> /controller, node already exists and owner '99134641238966279' does not match 
> current session '99134641238966277' 
> (kafka.zk.KafkaZkClient$CheckedEphemeral:71)
> [2017-12-07 19:23:04,695] ERROR ZKShutdownHandler is not registered, so 
> ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
> changes (org.apache.zookeeper.server.ZooKeeperServer:472)
> [2017-12-07 19:23:04,760] ERROR ZKShutdownHandler is not registered, so 
> ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
> changes (org.apache.zookeeper.server.ZooKeeperServer:472)
> [2017-12-07 19:23:06,764] ERROR Error while creating ephemeral at 
> /controller, node already exists and owner '99134641586700293' does not match 
> current session '99134641586700295' 
> (kafka.zk.KafkaZkClient$CheckedEphemeral:71)
> [2017-12-07 19:23:09,379] ERROR ZKShutdownHandler is not registered, so 
> ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
> changes (org.apache.zookeeper.server.ZooKeeperServer:472)
> [2017-12-07 19:23:09,387] ERROR ZKShutdownHandler is not registered, so 
> ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
> changes (org.apache.zookeeper.server.ZooKeeperServer:472)
> [2017-12-07 19:23:11,533] ERROR ZKShutdownHandler is not registered, so 
> ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
> changes (org.apache.zookeeper.server.ZooKeeperServer:472)
> [2017-12-07 19:23:11,539] ERROR ZKShutdownHandler is not registered, so 
> ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
> changes 

[jira] [Created] (KAFKA-6332) Kafka system tests should use nc instead of log grep to detect start-up

2017-12-08 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-6332:
--

 Summary: Kafka system tests should use nc instead of log grep to 
detect start-up
 Key: KAFKA-6332
 URL: https://issues.apache.org/jira/browse/KAFKA-6332
 Project: Kafka
  Issue Type: Bug
Reporter: Ismael Juma


[~ewencp] suggested using an {{nc -z}} check instead of grepping the logs, for a 
more reliable test. This came up when the system tests were broken by a log 
improvement change.

Reference: https://github.com/apache/kafka/pull/3834
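What an `nc -z host port` check does, attempt a TCP connect and report only success or failure, can be expressed in Java with a plain socket probe. This is a sketch of the idea behind the suggestion, not the actual system-test code.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Equivalent of `nc -z host port`: try to open a TCP connection and
// report readiness without sending any data. Illustrative sketch only.
public class PortProbe {
    static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;  // something accepted the connection: port is up
        } catch (IOException e) {
            return false; // connection refused or timed out: not ready yet
        }
    }
}
```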



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-6331) Transient failure in kafka.api.AdminClientIntegrationTest.testLogStartOffsetCheckpointkafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs

2017-12-08 Thread Guozhang Wang (JIRA)
Guozhang Wang created KAFKA-6331:


 Summary: Transient failure in 
kafka.api.AdminClientIntegrationTest.testLogStartOffsetCheckpointkafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs
 Key: KAFKA-6331
 URL: https://issues.apache.org/jira/browse/KAFKA-6331
 Project: Kafka
  Issue Type: Bug
  Components: unit tests
Reporter: Guozhang Wang


Saw this error once on Jenkins: 
https://builds.apache.org/job/kafka-pr-jdk9-scala2.12/3025/testReport/junit/kafka.api/AdminClientIntegrationTest/testAlterReplicaLogDirs/

{code}
Stacktrace

java.lang.AssertionError: timed out waiting for message produce
at kafka.utils.TestUtils$.fail(TestUtils.scala:347)
at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:861)
at 
kafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs(AdminClientIntegrationTest.scala:357)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:564)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.lang.Thread.run(Thread.java:844)
Standard Output

[2017-12-07 19:22:56,297] ERROR ZKShutdownHandler is not registered, so 
ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
changes (org.apache.zookeeper.server.ZooKeeperServer:472)
[2017-12-07 19:22:59,447] ERROR ZKShutdownHandler is not registered, so 
ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
changes (org.apache.zookeeper.server.ZooKeeperServer:472)
[2017-12-07 19:22:59,453] ERROR ZKShutdownHandler is not registered, so 
ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
changes (org.apache.zookeeper.server.ZooKeeperServer:472)
[2017-12-07 19:23:01,335] ERROR Error while creating ephemeral at /controller, 
node already exists and owner '99134641238966279' does not match current 
session '99134641238966277' (kafka.zk.KafkaZkClient$CheckedEphemeral:71)
[2017-12-07 19:23:04,695] ERROR ZKShutdownHandler is not registered, so 
ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
changes (org.apache.zookeeper.server.ZooKeeperServer:472)
[2017-12-07 19:23:04,760] ERROR ZKShutdownHandler is not registered, so 
ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
changes (org.apache.zookeeper.server.ZooKeeperServer:472)
[2017-12-07 19:23:06,764] ERROR Error while creating ephemeral at /controller, 
node already exists and owner '99134641586700293' does not match current 
session '99134641586700295' (kafka.zk.KafkaZkClient$CheckedEphemeral:71)
[2017-12-07 19:23:09,379] ERROR ZKShutdownHandler is not registered, so 
ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
changes (org.apache.zookeeper.server.ZooKeeperServer:472)
[2017-12-07 19:23:09,387] ERROR ZKShutdownHandler is not registered, so 
ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
changes (org.apache.zookeeper.server.ZooKeeperServer:472)
[2017-12-07 19:23:11,533] ERROR ZKShutdownHandler is not registered, so 
ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
changes (org.apache.zookeeper.server.ZooKeeperServer:472)
[2017-12-07 19:23:11,539] ERROR ZKShutdownHandler is not registered, so 
ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
changes (org.apache.zookeeper.server.ZooKeeperServer:472)
[2017-12-07 19:23:13,022] ERROR Error while creating ephemeral at /controller, 
node already exists and owner '99134642031034375' does not match current 
session '99134642031034373' (kafka.zk.KafkaZkClient$CheckedEphemeral:71)
[2017-12-07 19:23:14,667] ERROR ZKShutdownHandler is not registered, so 
ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
changes 

[jira] [Updated] (KAFKA-6330) KafkaZkClient request queue time metric

2017-12-08 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-6330:
---
Issue Type: Sub-task  (was: Bug)
Parent: KAFKA-5027

> KafkaZkClient request queue time metric
> ---
>
> Key: KAFKA-6330
> URL: https://issues.apache.org/jira/browse/KAFKA-6330
> Project: Kafka
>  Issue Type: Sub-task
>  Components: zkclient
>Reporter: Ismael Juma
>  Labels: needs-kip
>
> KafkaZkClient has a latency metric which is the time it takes to send a 
> request and receive the corresponding response.
> If ZooKeeperClient's `maxInFlightRequests` (10 by default) is reached, a 
> request may be held for some time before sending starts. This time is not 
> currently measured and it may be useful to know if requests are spending 
> longer than usual in the `queue` (conceptually as the current implementation 
> doesn't use a queue).
> This would require a KIP.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-6320) move ZK metrics in KafkaHealthCheck to ZookeeperClient

2017-12-08 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-6320:
---
Fix Version/s: 1.1.0

> move ZK metrics in KafkaHealthCheck to ZookeeperClient
> --
>
> Key: KAFKA-6320
> URL: https://issues.apache.org/jira/browse/KAFKA-6320
> Project: Kafka
>  Issue Type: Sub-task
>Affects Versions: 1.0.0
>Reporter: Jun Rao
> Fix For: 1.1.0
>
>
> In KAFKA-5473, we will be de-commissioning the usage of KafkaHealthCheck. So, 
> we need to move the ZK metrics SessionState and ZooKeeper${eventType}PerSec 
> in that class to somewhere else (e.g. ZookeeperClient).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-6330) KafkaZkClient request queue time metric

2017-12-08 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-6330:
--

 Summary: KafkaZkClient request queue time metric
 Key: KAFKA-6330
 URL: https://issues.apache.org/jira/browse/KAFKA-6330
 Project: Kafka
  Issue Type: Bug
  Components: zkclient
Reporter: Ismael Juma


KafkaZkClient has a latency metric that measures the time it takes to send a 
request and receive the corresponding response.

If ZooKeeperClient's `maxInFlightRequests` (10 by default) is reached, a 
request may be held for some time before sending starts. This time is not 
currently measured, and it may be useful to know whether requests are spending 
longer than usual in the `queue` (conceptually, as the current implementation 
doesn't actually use a queue).

This would require a KIP.
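The proposed measurement can be illustrated with a minimal sketch (plain Python with hypothetical names, not KafkaZkClient's actual implementation): timestamp each request when it is queued behind `maxInFlightRequests`, and record the wait when sending starts.

```python
# Minimal sketch of the proposed queue-time metric (hypothetical names, NOT
# KafkaZkClient's implementation): timestamp each request when it is queued
# behind maxInFlightRequests, and record the wait when sending starts.
import time
from collections import deque

class TimedRequestQueue:
    def __init__(self):
        self._queue = deque()
        self.queue_time_ms = []  # time each request spent waiting to be sent

    def enqueue(self, request):
        self._queue.append((request, time.monotonic()))

    def dequeue(self):
        request, enqueued_at = self._queue.popleft()
        self.queue_time_ms.append((time.monotonic() - enqueued_at) * 1000.0)
        return request

q = TimedRequestQueue()
q.enqueue("GetDataRequest")
time.sleep(0.01)  # simulate waiting behind maxInFlightRequests
request = q.dequeue()
print(request, round(q.queue_time_ms[0], 1))
```

The existing latency metric would then cover send-to-response time, while this histogram would cover enqueue-to-send time.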



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5200) If a replicated topic is deleted with one broker down, it can't be recreated

2017-12-08 Thread Matthias Rampke (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283713#comment-16283713
 ] 

Matthias Rampke commented on KAFKA-5200:


To expand on the workaround [~huxi_2b] proposed:

If you cannot resurrect the dead broker itself, you can make Kafka act as if 
you did:

# Start a new broker, but shut it down quickly (before any newly created 
partitions are assigned to it).
# In meta.properties, change the broker ID to that of the dead broker.
# Start it again.
# Watch its logs – it will pick up the pending deletions and complete them, or 
you can reassign partitions at this point.
# Stop it again.

This may be problematic if you have a lot of partition creation going on, 
because you need to avoid getting any partitions assigned to this broker while 
it's running, but otherwise this works without downtime.
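The meta.properties edit in the workaround above can be sketched as follows (the log dir path and broker IDs are purely illustrative; substitute your actual log.dirs path and the dead broker's ID):

```python
# Sketch of the meta.properties edit from the workaround above. The log dir
# and broker IDs are purely illustrative, not taken from any real cluster.
from pathlib import Path

log_dir = Path("/tmp/kafka-logs-standin")  # stand-in for your log.dirs
dead_broker_id = 187                       # the dead broker's ID

log_dir.mkdir(parents=True, exist_ok=True)
meta = log_dir / "meta.properties"
meta.write_text("version=0\nbroker.id=999\n")  # as written by the new broker

# Swap the stand-in ID for the dead broker's ID before restarting.
lines = []
for line in meta.read_text().splitlines():
    if line.startswith("broker.id="):
        line = f"broker.id={dead_broker_id}"
    lines.append(line)
meta.write_text("\n".join(lines) + "\n")
print(meta.read_text())
```

On restart, the broker then registers under the dead broker's ID and the controller can complete the pending deletions.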

> If a replicated topic is deleted with one broker down, it can't be recreated
> 
>
> Key: KAFKA-5200
> URL: https://issues.apache.org/jira/browse/KAFKA-5200
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Reporter: Edoardo Comar
>
> In a cluster with 5 brokers, replication factor=3, min in sync=2, 
> one broker went down.
> A user's app remained of course unaware of that and deleted a topic that 
> (unknowingly) had a replica on the dead broker.
> The topic went into 'pending delete' mode.
> The user then tried to recreate the topic - which failed, so his app was left 
> stuck - no working topic and no ability to create one.
> The reassignment tool fails to move the replica out of the dead broker - 
> specifically because the broker with the partition replica to move is dead :-)
> Incidentally the confluent-rebalancer docs say
> http://docs.confluent.io/current/kafka/post-deployment.html#scaling-the-cluster
> > Supports moving partitions away from dead brokers
> It'd be nice to similarly improve the open-source reassignment tool.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6325) Producer.flush() doesn't throw exception on timeout

2017-12-08 Thread Erik Scheuter (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283444#comment-16283444
 ] 

Erik Scheuter commented on KAFKA-6325:
--

I didn't modify the KafkaProducer itself, but rather the code that uses it: I 
loop through all the futures and call future.get() instead of producer.flush().

Option 2 is the easiest option: change the javadoc of the send() function as 
well, since it isn't completely async (or should this be another issue?).
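The workaround pattern — keeping every future and calling get() on each so failures surface — can be illustrated with plain Python futures (a generic sketch, not the Kafka Java client API):

```python
# Generic illustration of the workaround above using plain Python futures
# (NOT the Kafka Java client API): keep every future returned by send() and
# call .result() on each, so a failed send raises instead of being silently
# swallowed by a flush()-style "wait for everything" call.
from concurrent.futures import ThreadPoolExecutor

def send(record):
    """Stand-in for producer.send(); 'bad' simulates an unavailable broker."""
    if record == "bad":
        raise RuntimeError("broker unavailable")
    return record

with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(send, r) for r in ["a", "bad", "c"]]

# Leaving the with-block (the analogue of flush()) waited for completion but
# raised nothing; only inspecting each future surfaces the failure.
errors = []
for f in futures:
    try:
        f.result()
    except RuntimeError as exc:
        errors.append(str(exc))

print(errors)
```

The same trade-off applies in Java: Future.get() propagates the send exception, while flush() only waits for completion.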

> Producer.flush() doesn't throw exception on timeout
> ---
>
> Key: KAFKA-6325
> URL: https://issues.apache.org/jira/browse/KAFKA-6325
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Reporter: Erik Scheuter
> Attachments: FlushTest.java
>
>
> Reading the javadoc of the flush() method, we assumed an exception would be 
> thrown when an error occurs. This would make the code more understandable, 
> as we wouldn't have to keep a list of futures and eventually call 
> future.get() when sending multiple records to Kafka.
> When send() is called, the metadata is retrieved and send blocks on this 
> process. When this process fails (no brokers), a FutureFailure is returned. 
> When you just flush, no exceptions are thrown (in contrast to 
> future.get()). Of course, you can implement callbacks in the send method.
> I think there are two solutions:
> * Change flush() (& doSend()) and throw exceptions
> * Change the javadoc and describe the scenario you can lose events because no 
> exceptions are thrown and the events are not sent.
> I added a unit test to show the behaviour. Kafka doesn't have to be available 
> for this.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-6289) NetworkClient should not return internal failed api version responses from poll

2017-12-08 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram resolved KAFKA-6289.
---
   Resolution: Fixed
Fix Version/s: 1.0.1
   1.1.0

Issue resolved by pull request 4280
[https://github.com/apache/kafka/pull/4280]

> NetworkClient should not return internal failed api version responses from 
> poll
> ---
>
> Key: KAFKA-6289
> URL: https://issues.apache.org/jira/browse/KAFKA-6289
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
> Fix For: 1.1.0, 1.0.1
>
>
> In the AdminClient, if the initial ApiVersion request sent to the broker 
> fails, we see the following obscure message:
> {code}
> [2017-11-30 17:18:48,677] ERROR Internal server error on -2: server returned 
> information about unknown correlation ID 0.  requestHeader = 
> {api_key=18,api_version=1,correlation_id=0,client_id=adminclient-3} 
> (org.apache.kafka.clients.admin.KafkaAdminClient)
> {code}
> What's happening is that the response to the internal ApiVersion request 
> which is received in NetworkClient is mistakenly being sent to the upper 
> layer (the admin client in this case). The admin wasn't expecting it, so we 
> see this message. Instead, the request should be handled internally in 
> NetworkClient.
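The fix described above can be sketched as follows (hypothetical names in plain Python, not NetworkClient's actual code): a response whose correlation ID maps to an internal request is consumed inside the network layer instead of being returned from poll() to the upper layer.

```python
# Sketch of the fix described above, with hypothetical names (NOT the actual
# NetworkClient code): a response whose correlation ID maps to an internal
# request (like ApiVersions) is consumed inside the network layer instead of
# being returned from poll() to the upper layer (e.g. the AdminClient).
in_flight = {
    0: {"api": "ApiVersions", "internal": True},
    1: {"api": "Metadata", "internal": False},
}

def poll(responses):
    delivered = []
    for corr_id, body in responses:
        request = in_flight.pop(corr_id)
        if request["internal"]:
            # Handled internally; the upper layer never sees it, so no
            # "unknown correlation ID" error can be raised there.
            continue
        delivered.append((request["api"], body))
    return delivered

out = poll([(0, "api versions"), (1, "cluster metadata")])
print(out)
```

Before the fix, both responses would be handed to the upper layer, and the one with correlation ID 0 would trigger the "unknown correlation ID" error shown above.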



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6289) NetworkClient should not return internal failed api version responses from poll

2017-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283362#comment-16283362
 ] 

ASF GitHub Bot commented on KAFKA-6289:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/4280


> NetworkClient should not return internal failed api version responses from 
> poll
> ---
>
> Key: KAFKA-6289
> URL: https://issues.apache.org/jira/browse/KAFKA-6289
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>
> In the AdminClient, if the initial ApiVersion request sent to the broker 
> fails, we see the following obscure message:
> {code}
> [2017-11-30 17:18:48,677] ERROR Internal server error on -2: server returned 
> information about unknown correlation ID 0.  requestHeader = 
> {api_key=18,api_version=1,correlation_id=0,client_id=adminclient-3} 
> (org.apache.kafka.clients.admin.KafkaAdminClient)
> {code}
> What's happening is that the response to the internal ApiVersion request 
> which is received in NetworkClient is mistakenly being sent to the upper 
> layer (the admin client in this case). The admin wasn't expecting it, so we 
> see this message. Instead, the request should be handled internally in 
> NetworkClient.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6266) Kafka 1.0.0 : Repeated occurrence of WARN Resetting first dirty offset of __consumer_offsets-xx to log start offset 203569 since the checkpointed offset 120955 is inval

2017-12-08 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283358#comment-16283358
 ] 

Ismael Juma commented on KAFKA-6266:


cc [~junrao]

> Kafka 1.0.0 : Repeated occurrence of WARN Resetting first dirty offset of 
> __consumer_offsets-xx to log start offset 203569 since the checkpointed 
> offset 120955 is invalid. (kafka.log.LogCleanerManager$)
> --
>
> Key: KAFKA-6266
> URL: https://issues.apache.org/jira/browse/KAFKA-6266
> Project: Kafka
>  Issue Type: Bug
>  Components: offset manager
>Affects Versions: 1.0.0
> Environment: CentOS 7, Apache kafka_2.12-1.0.0
>Reporter: VinayKumar
>
> I upgraded Kafka from 0.10.2.1 to version 1.0.0. Since then, I see the 
> warnings below in the log.
> I'm seeing these continuously, and I want them fixed so that they won't 
> repeat. Can someone please help me fix the following warnings?
> WARN Resetting first dirty offset of __consumer_offsets-17 to log start 
> offset 3346 since the checkpointed offset 3332 is invalid. 
> (kafka.log.LogCleanerManager$)
> WARN Resetting first dirty offset of __consumer_offsets-23 to log start 
> offset 4 since the checkpointed offset 1 is invalid. 
> (kafka.log.LogCleanerManager$)
> WARN Resetting first dirty offset of __consumer_offsets-19 to log start 
> offset 203569 since the checkpointed offset 120955 is invalid. 
> (kafka.log.LogCleanerManager$)
> WARN Resetting first dirty offset of __consumer_offsets-35 to log start 
> offset 16957 since the checkpointed offset 7 is invalid. 
> (kafka.log.LogCleanerManager$)
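The reset rule behind these warnings can be sketched as follows (a simplification, not LogCleanerManager's actual code): a checkpointed cleaner offset below the log start offset is invalid, so cleaning restarts from the log start offset.

```python
# Simplified sketch of the reset rule behind the warnings above (NOT
# LogCleanerManager's actual code): a checkpointed cleaner offset below the
# log start offset (e.g. after retention deleted old segments) is invalid,
# so cleaning restarts from the log start offset and the warning is logged.
def first_dirty_offset(checkpointed, log_start):
    if checkpointed < log_start:
        print(f"WARN Resetting first dirty offset to log start offset "
              f"{log_start} since the checkpointed offset {checkpointed} "
              f"is invalid.")
        return log_start
    return checkpointed

# The numbers from one of the reported warnings:
offset = first_dirty_offset(checkpointed=120955, log_start=203569)
print(offset)
```

Under this rule the warning is benign in itself; it recurs as long as old checkpoints keep falling behind the advancing log start offsets.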



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6314) Add a tool to delete kafka based consumer offsets for a given group

2017-12-08 Thread Tom Scott (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283277#comment-16283277
 ] 

Tom Scott commented on KAFKA-6314:
--

Thanks, I've modified it to reference Kafka-based offsets.

> Add a tool to delete kafka based consumer offsets for a given group
> ---
>
> Key: KAFKA-6314
> URL: https://issues.apache.org/jira/browse/KAFKA-6314
> Project: Kafka
>  Issue Type: New Feature
>  Components: consumer, core, tools
>Reporter: Tom Scott
>Priority: Minor
>
> Add a tool to delete kafka based consumer offsets for a given group similar 
> to the reset tool. It could look something like this:
> kafka-consumer-groups --bootstrap-server localhost:9092 --delete-offsets 
> --group somegroup
> The case for this is as follows:
> 1. Consumer group with id: group1 subscribes to topic1
> 2. The group is stopped 
> 3. The subscription changed to topic2 but the id is kept as group1
> Now the output of kafka-consumer-groups --describe for the group will 
> show topic1 even though the group is not subscribed to that topic. This is 
> bad for monitoring as it will show lag on topic1.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-6314) Add a tool to delete kafka based consumer offsets for a given group

2017-12-08 Thread Tom Scott (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom Scott updated KAFKA-6314:
-
Description: 
Add a tool to delete kafka based consumer offsets for a given group similar to 
the reset tool. It could look something like this:

kafka-consumer-groups --bootstrap-server localhost:9092 --delete-offsets 
--group somegroup

The case for this is as follows:

1. Consumer group with id: group1 subscribes to topic1
2. The group is stopped 
3. The subscription changed to topic2 but the id is kept as group1

Now the output of kafka-consumer-groups --describe for the group will show 
topic1 even though the group is not subscribed to that topic. This is bad for 
monitoring as it will show lag on topic1.






  was:
Add a tool to delete consumer offsets for a given group similar to the reset 
tool. It could look something like this:

kafka-consumer-groups --bootstrap-server localhost:9092 --delete-offsets 
--group somegroup

The case for this is as follows:

1. Consumer group with id: group1 subscribes to topic1
2. The group is stopped 
3. The subscription changed to topic2 but the id is kept as group1

Now the output of kafka-consumer-groups --describe for the group will show 
topic1 even though the group is not subscribed to that topic. This is bad for 
monitoring as it will show lag on topic1.







> Add a tool to delete kafka based consumer offsets for a given group
> ---
>
> Key: KAFKA-6314
> URL: https://issues.apache.org/jira/browse/KAFKA-6314
> Project: Kafka
>  Issue Type: New Feature
>  Components: consumer, core, tools
>Reporter: Tom Scott
>Priority: Minor
>
> Add a tool to delete kafka based consumer offsets for a given group similar 
> to the reset tool. It could look something like this:
> kafka-consumer-groups --bootstrap-server localhost:9092 --delete-offsets 
> --group somegroup
> The case for this is as follows:
> 1. Consumer group with id: group1 subscribes to topic1
> 2. The group is stopped 
> 3. The subscription changed to topic2 but the id is kept as group1
> Now the output of kafka-consumer-groups --describe for the group will 
> show topic1 even though the group is not subscribed to that topic. This is 
> bad for monitoring as it will show lag on topic1.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-6314) Add a tool to delete kafka based consumer offsets for a given group

2017-12-08 Thread Tom Scott (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom Scott updated KAFKA-6314:
-
Summary: Add a tool to delete kafka based consumer offsets for a given 
group  (was: Add a tool to delete consumer offsets for a given group)

> Add a tool to delete kafka based consumer offsets for a given group
> ---
>
> Key: KAFKA-6314
> URL: https://issues.apache.org/jira/browse/KAFKA-6314
> Project: Kafka
>  Issue Type: New Feature
>  Components: consumer, core, tools
>Reporter: Tom Scott
>Priority: Minor
>
> Add a tool to delete consumer offsets for a given group similar to the reset 
> tool. It could look something like this:
> kafka-consumer-groups --bootstrap-server localhost:9092 --delete-offsets 
> --group somegroup
> The case for this is as follows:
> 1. Consumer group with id: group1 subscribes to topic1
> 2. The group is stopped 
> 3. The subscription changed to topic2 but the id is kept as group1
> Now the output of kafka-consumer-groups --describe for the group will 
> show topic1 even though the group is not subscribed to that topic. This is 
> bad for monitoring as it will show lag on topic1.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6314) Add a tool to delete consumer offsets for a given group

2017-12-08 Thread James Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283235#comment-16283235
 ] 

James Cheng commented on KAFKA-6314:


FYI, kafka-consumer-groups.sh already has --delete support, but it only works 
for zookeeper-based offsets. 
{noformat}
$ ~/kafka_2.11-1.0.0/bin/kafka-consumer-groups.sh
List all consumer groups, describe a consumer group, delete consumer group 
info, or reset consumer group offsets.
Option  Description
--  ---
--deletePass in groups to delete topic
  partition offsets and ownership
  information over the entire consumer
  group. For instance --group g1 --
  group g2
Pass in groups with a single topic to
  just delete the given topic's
  partition offsets and ownership
  information for the given consumer
  groups. For instance --group g1 --
  group g2 --topic t1
Pass in just a topic to delete the
  given topic's partition offsets and
  ownership information for every
  consumer group. For instance --topic
  t1
WARNING: Group deletion only works for
  old ZK-based consumer groups, and
  one has to use it carefully to only
  delete groups that are not active.
{noformat}

So this JIRA should say that the RFE is to let us delete kafka-based offsets.

> Add a tool to delete consumer offsets for a given group
> ---
>
> Key: KAFKA-6314
> URL: https://issues.apache.org/jira/browse/KAFKA-6314
> Project: Kafka
>  Issue Type: New Feature
>  Components: consumer, core, tools
>Reporter: Tom Scott
>Priority: Minor
>
> Add a tool to delete consumer offsets for a given group similar to the reset 
> tool. It could look something like this:
> kafka-consumer-groups --bootstrap-server localhost:9092 --delete-offsets 
> --group somegroup
> The case for this is as follows:
> 1. Consumer group with id: group1 subscribes to topic1
> 2. The group is stopped 
> 3. The subscription changed to topic2 but the id is kept as group1
> Now the output of kafka-consumer-groups --describe for the group will 
> show topic1 even though the group is not subscribed to that topic. This is 
> bad for monitoring as it will show lag on topic1.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (KAFKA-6326) when broker is unavailable(such as broker's machine is down), controller will wait 30 sec timeout

2017-12-08 Thread HongLiang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283207#comment-16283207
 ] 

HongLiang edited comment on KAFKA-6326 at 12/8/17 8:37 AM:
---

[~huxi_2b][{color:red}2017-10-30 14:15:18,009{color}] 
[ZkClient-EventThread-28-] INFO 
[Controller-188-to-broker-187-send-thread], {color:red}Shutting down{color} 
(kafka.controller.RequestSendThread)
[{color:red}2017-10-30 14:15:43,828{color}] 
[Controller-188-to-broker-187-send-thread] WARN 
[Controller-188-to-broker-187-send-thread], Controller 188's connection to 
broker 187:9092 (id: 187 rack: null) was unsuccessful 
(kafka.controller.RequestSendThread)
java.net.SocketTimeoutException: Failed to connect within 30000 ms
  at 
kafka.controller.RequestSendThread.brokerReady(ControllerChannelManager.scala:237)
  at 
kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:189)
  at 
kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:188)
  at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
[2017-10-30 14:15:43,828] [Controller-188-to-broker-187-send-thread] INFO 
[Controller-188-to-broker-187-send-thread], Stopped  
(kafka.controller.RequestSendThread)
[2017-10-30 14:15:43,828] [ZkClient-EventThread-28-xxx] INFO 
[Controller-188-to-broker-187-send-thread], Shutdown completed 
(kafka.controller.RequestSendThread)


was (Author: hongliang):
[~huxi_2b][{color:red}2017-10-30 14:15:18,009{color}] 
[ZkClient-EventThread-28-] INFO 
[Controller-188-to-broker-187-send-thread], {color:red}Shutting down{color} 
(kafka.controller.RequestSendThread)

> when broker is unavailable(such as broker's machine is down), controller will 
> wait 30 sec timeout 
> --
>
> Key: KAFKA-6326
> URL: https://issues.apache.org/jira/browse/KAFKA-6326
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 1.0.0
>Reporter: HongLiang
> Attachments: _f7d1d2b4-39ae-4e02-8519-99bcba849668.png
>
>
> When a broker is unavailable (such as when the broker's machine is down), the 
> controller will wait for a 30 sec timeout by default. It seems that this 
> timeout wait is not necessary, and it increases the MTTR of a dead broker.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6326) when broker is unavailable(such as broker's machine is down), controller will wait 30 sec timeout

2017-12-08 Thread HongLiang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283207#comment-16283207
 ] 

HongLiang commented on KAFKA-6326:
--

[~huxi_2b][{color:red}2017-10-30 14:15:18,009{color}] 
[ZkClient-EventThread-28-] INFO 
[Controller-188-to-broker-187-send-thread], {color:red}Shutting down{color} 
(kafka.controller.RequestSendThread)

> when broker is unavailable(such as broker's machine is down), controller will 
> wait 30 sec timeout 
> --
>
> Key: KAFKA-6326
> URL: https://issues.apache.org/jira/browse/KAFKA-6326
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 1.0.0
>Reporter: HongLiang
> Attachments: _f7d1d2b4-39ae-4e02-8519-99bcba849668.png
>
>
> When a broker is unavailable (such as when the broker's machine is down), the 
> controller will wait for a 30 sec timeout by default. It seems that this 
> timeout wait is not necessary, and it increases the MTTR of a dead broker.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6326) when broker is unavailable(such as broker's machine is down), controller will wait 30 sec timeout

2017-12-08 Thread huxihx (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283197#comment-16283197
 ] 

huxihx commented on KAFKA-6326:
---

[~HongLiang] Could you provide the exact timestamp when the RequestSendThread 
was shut down? I cannot tell from the screenshot. Search for "Shutting down 
(kafka.controller.RequestSendThread)" in the log to find out.

> when broker is unavailable(such as broker's machine is down), controller will 
> wait 30 sec timeout 
> --
>
> Key: KAFKA-6326
> URL: https://issues.apache.org/jira/browse/KAFKA-6326
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 1.0.0
>Reporter: HongLiang
> Attachments: _f7d1d2b4-39ae-4e02-8519-99bcba849668.png
>
>
> When a broker is unavailable (such as when the broker's machine is down), the 
> controller will wait for a 30 sec timeout by default. It seems that this 
> timeout wait is not necessary, and it increases the MTTR of a dead broker.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)