[jira] [Created] (KAFKA-7538) Improve locking model used to update ISRs and HW

2018-10-24 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-7538:
-

 Summary: Improve locking model used to update ISRs and HW
 Key: KAFKA-7538
 URL: https://issues.apache.org/jira/browse/KAFKA-7538
 Project: Kafka
  Issue Type: Improvement
  Components: core
Affects Versions: 2.1.0
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram
 Fix For: 2.2.0


We currently use a ReadWriteLock in Partition to update ISRs and high water 
mark for the partition. This can result in severe lock contention if there are 
multiple producers writing a large amount of data into a single partition.

The current locking model is:
 # read lock while appending to log on every Produce request on the request 
handler thread
 # write lock on leader change, updating ISRs etc. on request handler or 
scheduler thread
 # write lock on every replica fetch request to check if ISRs need to be 
updated and to update HW and ISR on the request handler thread

2) is infrequent, but 1) and 3) may be frequent and can result in lock 
contention. If there are lots of produce requests to a partition from multiple 
processes, on the leader broker we may see:
 # one slow log append locks up one request thread for that produce while 
holding onto the read lock
 # (replicationFactor-1) request threads can be blocked waiting for write lock 
to process replica fetch request
 # potentially several other request threads processing Produce may be queued 
up to acquire read lock because of the waiting writers.

In a thread dump with this issue, we noticed several request threads blocked 
waiting for the write lock, possibly due to replica fetch retries.

 

Possible fixes:
 # Process `Partition#maybeExpandIsr` on a single scheduler thread similar to 
`Partition#maybeShrinkIsr` so that only a single thread is blocked on the write 
lock. But this will delay updating ISRs and HW.
 # Change locking in `Partition#maybeExpandIsr` so that only read lock is 
acquired to check if ISR needs updating and write lock is acquired only to 
update ISRs. Also use a different lock for updating HW (perhaps just the 
Partition object lock) so that typical replica fetch requests complete without 
acquiring Partition write lock on the request handler thread.

I will submit a PR for 2), but other suggestions to fix this are welcome.
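
For illustration, here is a minimal sketch (plain Java with hypothetical names, 
not the actual Partition code) of the double-checked locking intended in 2): 
the shared read lock covers the common no-op check, and the write lock is taken 
only when the ISR actually needs to change.

{code:java}
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of fix 2); not Kafka's Partition implementation.
class PartitionLockingSketch {
    private final ReentrantReadWriteLock leaderIsrUpdateLock = new ReentrantReadWriteLock();
    private volatile boolean isrNeedsExpansion; // stand-in for the real "replica caught up" check

    void maybeExpandIsr() {
        boolean needsUpdate;
        leaderIsrUpdateLock.readLock().lock(); // cheap check under the shared read lock
        try {
            needsUpdate = isrNeedsExpansion;
        } finally {
            leaderIsrUpdateLock.readLock().unlock();
        }
        if (!needsUpdate)
            return; // common case: the replica fetch completes without the write lock

        leaderIsrUpdateLock.writeLock().lock(); // rare case: actually expand the ISR
        try {
            if (isrNeedsExpansion) { // re-check; state may have changed since the read
                // ... update ISR state here ...
                isrNeedsExpansion = false;
            }
        } finally {
            leaderIsrUpdateLock.writeLock().unlock();
        }
    }
}
{code}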

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KAFKA-7509) Kafka Connect logs unnecessary warnings about unused configurations

2018-10-24 Thread JIRA


[ 
https://issues.apache.org/jira/browse/KAFKA-7509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16660868#comment-16660868
 ] 

Sönke Liebau edited comment on KAFKA-7509 at 10/24/18 12:24 PM:


Hi [~rhauch], wouldn't this take away the ability to pass configuration options 
to custom interceptor and partitioner classes?

This is probably not something that is widely done, but I can think of a few 
use cases, like making a call to an external system from a partitioner and 
setting the target URL in configuration. Or, come to think of it, the Control 
Center interceptors also allow some configuration parameters to be passed to 
them that are not defined in ProducerConfig, I think.

 

Would it be an option to add prefixes to the config classes? Say anything 
starting with authorizer.config. would be allowed, hence we can give people the 
possibility of adding options to pluggable classes but still be able to 
properly validate the config to some extent.
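
For illustration only, a rough Java sketch of that prefix idea; the prefix 
value and helper names here are made up, not an existing Kafka API:

{code:java}
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: split a reserved prefix off for pluggable classes and
// leave everything else subject to normal ConfigDef validation.
class PrefixPassthroughSketch {
    static final String PASSTHROUGH_PREFIX = "authorizer.config."; // made-up reserved prefix

    static Map<String, String> passthroughConfigs(Map<String, String> props) {
        Map<String, String> passthrough = new HashMap<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            if (e.getKey().startsWith(PASSTHROUGH_PREFIX))
                passthrough.put(e.getKey().substring(PASSTHROUGH_PREFIX.length()), e.getValue());
            // non-prefixed keys stay subject to the usual ConfigDef validation
        }
        return passthrough;
    }
}
{code}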

 


was (Author: sliebau):
Hi [~rhauch], wouldn't this take away the ability to pass configuration options 
to custom interceptor and partitioner classes?

This is probably not something that is widely done, but I can think of a few 
use cases like making a call to an external system from a partitioner and 
setting the target url in configuration. Or, come to think of it, the control 
center interceptors also allow some configuration parameters to be passed to 
them that are not defined in ProducerConfig I think.

 

> Kafka Connect logs unnecessary warnings about unused configurations
> ---
>
> Key: KAFKA-7509
> URL: https://issues.apache.org/jira/browse/KAFKA-7509
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: Randall Hauch
>Assignee: Randall Hauch
>Priority: Major
>
> When running Connect, the logs contain quite a few warnings about "The 
> configuration '{}' was supplied but isn't a known config." This occurs when 
> Connect creates producers, consumers, and admin clients, because the 
> AbstractConfig is logging unused configuration properties upon construction. 
> It's complicated by the fact that the Producer, Consumer, and AdminClient all 
> create their own AbstractConfig instances within the constructor, so we can't 
> even call its {{ignore(String key)}} method.
> See also KAFKA-6793 for a similar issue with Streams.
> There are no arguments in the Producer, Consumer, or AdminClient constructors 
> to control whether the configs log these warnings, so a simpler workaround 
> is to only pass those configuration properties to the Producer, Consumer, and 
> AdminClient that the ProducerConfig, ConsumerConfig, and AdminClientConfig 
> configdefs know about.
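
A minimal Java sketch of that workaround, shown for the producer case and 
assuming it runs wherever Connect assembles the client properties; 
{{ProducerConfig.configNames()}} exposes the keys the configdef knows about:

{code:java}
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.producer.ProducerConfig;

// Sketch: forward only the keys the producer's ConfigDef knows about, so
// AbstractConfig has no "unknown config" properties left to warn on.
class KnownProducerConfigs {
    static Map<String, Object> retainKnown(Map<String, Object> workerProps) {
        Map<String, Object> known = new HashMap<>();
        for (Map.Entry<String, Object> e : workerProps.entrySet()) {
            if (ProducerConfig.configNames().contains(e.getKey()))
                known.put(e.getKey(), e.getValue());
        }
        return known;
    }
}
{code}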



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7536) TopologyTestDriver cannot pre-populate KTable

2018-10-24 Thread Dmitry Minkovsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Minkovsky updated KAFKA-7536:

Description: 
I have a GlobalKTable that's defined as

{code}

GlobalKTable userIdsByEmail = topology  
   .globalTable(USER_IDS_BY_EMAIL.name,
   USER_IDS_BY_EMAIL.consumed(),
   Materialized.as("user-ids-by-email"));
{code}

And the following test in Spock:

{code}
def topology = // my topology
def driver = new TopologyTestDriver(topology, config())

def cleanup() {
driver.close()
}

def "create from email request"() {

def store = driver.getKeyValueStore('user-ids-by-email')
store.put('string', ByteString.copyFrom(new byte[0]))
// more, but it fails at the `put` above
{code}

When I run this, I get the following:

{code}

[2018-10-23 19:35:27,055] INFO 
(org.apache.kafka.streams.processor.internals.GlobalStateManagerImpl) mock 
Restoring state for global store user-ids-by-email

java.lang.NullPointerException
at 
org.apache.kafka.streams.processor.internals.AbstractProcessorContext.topic(AbstractProcessorContext.java:115)
at 
org.apache.kafka.streams.state.internals.CachingKeyValueStore.putInternal(CachingKeyValueStore.java:237)
at 
org.apache.kafka.streams.state.internals.CachingKeyValueStore.put(CachingKeyValueStore.java:220)
at 
org.apache.kafka.streams.state.internals.CachingKeyValueStore.put(CachingKeyValueStore.java:38)
at 
org.apache.kafka.streams.state.internals.InnerMeteredKeyValueStore.put(InnerMeteredKeyValueStore.java:206)
at 
org.apache.kafka.streams.state.internals.MeteredKeyValueBytesStore.put(MeteredKeyValueBytesStore.java:117)
at pony.message.MessageWriteStreamsTest.create from mailgun email 
request(MessageWriteStreamsTest.groovy:52)

[2018-10-23 19:35:27,189] INFO 
(org.apache.kafka.streams.processor.internals.StateDirectory) stream-thread 
[main] Deleting state directory 0_0 for task 0_0 as user calling cleanup.
{code}

The same issue applies to KTable.

I've noticed that I can {{put()}} to the store if I first write to it with 
{{driver.pipeInput}}. But otherwise I get the above error.

  was:
I have a GlobalKTable that's defined as

{code}

GlobalKTable userIdsByEmail = topology  
   .globalTable(USER_IDS_BY_EMAIL.name,
   USER_IDS_BY_EMAIL.consumed(),
   Materialized.as("user-ids-by-email"));
{code}

And the following test in Spock:

{code}
def topology = // my topology
def driver = new TopologyTestDriver(topology, config())

def cleanup() {
driver.close()
}

def "create from email request"() {

def store = driver.getKeyValueStore('user-ids-by-email')
store.put('string', ByteString.copyFrom(new byte[0]))
// more, but it fails at the `put` above
{code}

When I run this, I get the following:

{code}

[2018-10-23 19:35:27,055] INFO 
(org.apache.kafka.streams.processor.internals.GlobalStateManagerImpl) mock 
Restoring state for global store user-ids-by-email

java.lang.NullPointerException
at 
org.apache.kafka.streams.processor.internals.AbstractProcessorContext.topic(AbstractProcessorContext.java:115)
at 
org.apache.kafka.streams.state.internals.CachingKeyValueStore.putInternal(CachingKeyValueStore.java:237)
at 
org.apache.kafka.streams.state.internals.CachingKeyValueStore.put(CachingKeyValueStore.java:220)
at 
org.apache.kafka.streams.state.internals.CachingKeyValueStore.put(CachingKeyValueStore.java:38)
at 
org.apache.kafka.streams.state.internals.InnerMeteredKeyValueStore.put(InnerMeteredKeyValueStore.java:206)
at 
org.apache.kafka.streams.state.internals.MeteredKeyValueBytesStore.put(MeteredKeyValueBytesStore.java:117)
at pony.message.MessageWriteStreamsTest.create from mailgun email 
request(MessageWriteStreamsTest.groovy:52)

[2018-10-23 19:35:27,189] INFO 
(org.apache.kafka.streams.processor.internals.StateDirectory) stream-thread 
[main] Deleting state directory 0_0 for task 0_0 as user calling cleanup.
{code}

I've noticed that I can {{put()}} to the store if I first write to it with 
{{driver.pipeInput}}. But otherwise I get the above error.


> TopologyTestDriver cannot pre-populate KTable
> -
>
> Key: KAFKA-7536
> URL: https://issues.apache.org/jira/browse/KAFKA-7536
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.0.0
>Reporter: Dmitry Minkovsky
>Priority: Minor
>
> I have a GlobalKTable that's defined as
> {code}
> GlobalKTable userIdsByEmail = topology  
>.globalTable(USER_IDS_BY_EMAIL.name,
>USER_IDS_BY_EMAIL.consumed(),
>Materialized.as("user-ids-by-email"));

[jira] [Updated] (KAFKA-7536) TopologyTestDriver cannot pre-populate KTable

2018-10-24 Thread Dmitry Minkovsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Minkovsky updated KAFKA-7536:

Summary: TopologyTestDriver cannot pre-populate KTable  (was: 
TopologyTestDriver cannot pre-populate GlobalKTable)

> TopologyTestDriver cannot pre-populate KTable
> -
>
> Key: KAFKA-7536
> URL: https://issues.apache.org/jira/browse/KAFKA-7536
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.0.0
>Reporter: Dmitry Minkovsky
>Priority: Minor
>
> I have a GlobalKTable that's defined as
> {code}
> GlobalKTable userIdsByEmail = topology  
>.globalTable(USER_IDS_BY_EMAIL.name,
>USER_IDS_BY_EMAIL.consumed(),
>Materialized.as("user-ids-by-email"));
> {code}
> And the following test in Spock:
> {code}
> def topology = // my topology
> def driver = new TopologyTestDriver(topology, config())
> def cleanup() {
> driver.close()
> }
> def "create from email request"() {
> def store = driver.getKeyValueStore('user-ids-by-email')
> store.put('string', ByteString.copyFrom(new byte[0]))
> // more, but it fails at the `put` above
> {code}
> When I run this, I get the following:
> {code}
> [2018-10-23 19:35:27,055] INFO 
> (org.apache.kafka.streams.processor.internals.GlobalStateManagerImpl) mock 
> Restoring state for global store user-ids-by-email
> java.lang.NullPointerException
>   at 
> org.apache.kafka.streams.processor.internals.AbstractProcessorContext.topic(AbstractProcessorContext.java:115)
>   at 
> org.apache.kafka.streams.state.internals.CachingKeyValueStore.putInternal(CachingKeyValueStore.java:237)
>   at 
> org.apache.kafka.streams.state.internals.CachingKeyValueStore.put(CachingKeyValueStore.java:220)
>   at 
> org.apache.kafka.streams.state.internals.CachingKeyValueStore.put(CachingKeyValueStore.java:38)
>   at 
> org.apache.kafka.streams.state.internals.InnerMeteredKeyValueStore.put(InnerMeteredKeyValueStore.java:206)
>   at 
> org.apache.kafka.streams.state.internals.MeteredKeyValueBytesStore.put(MeteredKeyValueBytesStore.java:117)
>   at pony.message.MessageWriteStreamsTest.create from mailgun email 
> request(MessageWriteStreamsTest.groovy:52)
> [2018-10-23 19:35:27,189] INFO 
> (org.apache.kafka.streams.processor.internals.StateDirectory) stream-thread 
> [main] Deleting state directory 0_0 for task 0_0 as user calling cleanup.
> {code}
> I've noticed that I can {{put()}} to the store if I first write to it with 
> {{driver.pipeInput}}. But otherwise I get the above error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-3932) Consumer fails to consume in a round robin fashion

2018-10-24 Thread CHIENHSING WU (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662386#comment-16662386
 ] 

CHIENHSING WU commented on KAFKA-3932:
--

I am going ahead to implement the change I proposed. I don't have the rights to 
assign this ticket to myself. *This comment hopefully alerts anyone who plans 
to work on it.*

> Consumer fails to consume in a round robin fashion
> --
>
> Key: KAFKA-3932
> URL: https://issues.apache.org/jira/browse/KAFKA-3932
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.0.0
>Reporter: Elias Levy
>Priority: Major
>
> The Java consumer fails to consume messages in a round robin fashion.  This 
> can lead to unbalanced consumption.
> In our use case we have a set of consumers that can take a significant amount 
> of time consuming messages off a topic.  For this reason, we are using the 
> pause/poll/resume pattern to ensure the consumer session does not time out.  
> The topic that is being consumed has been preloaded with messages.  That means 
> there is a significant message lag when the consumer is first started.  To 
> limit how many messages are consumed at a time, the consumer has been 
> configured with max.poll.records=1.
> The first observation is that the client receives a large batch of messages 
> for the first partition it decides to consume from and will consume all those 
> messages before moving on, rather than returning a message from a different 
> partition for each call to poll.
> We solved this issue by configuring max.partition.fetch.bytes to be small 
> enough that only a single message will be returned by the broker on each 
> fetch, although this would not be feasible if message size were highly 
> variable.
> The behavior of the consumer after this change is to largely consume from a 
> small number of partitions, usually just two, iterating between them until it 
> exhausts them, before moving to another partition.  This behavior is 
> problematic if the messages have some rough time semantics and need to be 
> processed roughly in time order across all partitions.
> It would be useful if the consumer had a pluggable API that allowed custom 
> logic to select which partition to consume from next, thus enabling the 
> creation of a round robin partition consumer.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7536) TopologyTestDriver cannot pre-populate KTable

2018-10-24 Thread Dmitry Minkovsky (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662413#comment-16662413
 ] 

Dmitry Minkovsky commented on KAFKA-7536:
-

Thank you.

Yes, disabling cache makes the issue go away.

I tested, and this affects regular KTable as well. Disabling cache makes that 
go away too. 

Not sure how I will work around this, either by abstracting store creation to 
make disabling cache easy for tests, or just by piping input. Looks like it 
will depend on the situation.
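
For illustration, a minimal Java sketch of the first workaround: build the 
store with caching disabled so the test driver's {{put()}} bypasses 
CachingKeyValueStore. The topic name here is assumed and default serdes are 
used, unlike the report's USER_IDS_BY_EMAIL helper:

{code:java}
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.GlobalKTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;

class CachingDisabledTopologySketch {
    static StreamsBuilder build() {
        StreamsBuilder topology = new StreamsBuilder();
        GlobalKTable<String, byte[]> userIdsByEmail = topology.globalTable(
                "user-ids-by-email-topic", // assumed topic name
                Materialized.<String, byte[], KeyValueStore<Bytes, byte[]>>as("user-ids-by-email")
                        .withCachingDisabled()); // no caching layer, so put() works in tests
        return topology;
    }
}
{code}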

> TopologyTestDriver cannot pre-populate KTable
> -
>
> Key: KAFKA-7536
> URL: https://issues.apache.org/jira/browse/KAFKA-7536
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.0.0
>Reporter: Dmitry Minkovsky
>Priority: Minor
>
> I have a GlobalKTable that's defined as
> {code}
> GlobalKTable userIdsByEmail = topology  
>.globalTable(USER_IDS_BY_EMAIL.name,
>USER_IDS_BY_EMAIL.consumed(),
>Materialized.as("user-ids-by-email"));
> {code}
> And the following test in Spock:
> {code}
> def topology = // my topology
> def driver = new TopologyTestDriver(topology, config())
> def cleanup() {
> driver.close()
> }
> def "create from email request"() {
> def store = driver.getKeyValueStore('user-ids-by-email')
> store.put('string', ByteString.copyFrom(new byte[0]))
> // more, but it fails at the `put` above
> {code}
> When I run this, I get the following:
> {code}
> [2018-10-23 19:35:27,055] INFO 
> (org.apache.kafka.streams.processor.internals.GlobalStateManagerImpl) mock 
> Restoring state for global store user-ids-by-email
> java.lang.NullPointerException
>   at 
> org.apache.kafka.streams.processor.internals.AbstractProcessorContext.topic(AbstractProcessorContext.java:115)
>   at 
> org.apache.kafka.streams.state.internals.CachingKeyValueStore.putInternal(CachingKeyValueStore.java:237)
>   at 
> org.apache.kafka.streams.state.internals.CachingKeyValueStore.put(CachingKeyValueStore.java:220)
>   at 
> org.apache.kafka.streams.state.internals.CachingKeyValueStore.put(CachingKeyValueStore.java:38)
>   at 
> org.apache.kafka.streams.state.internals.InnerMeteredKeyValueStore.put(InnerMeteredKeyValueStore.java:206)
>   at 
> org.apache.kafka.streams.state.internals.MeteredKeyValueBytesStore.put(MeteredKeyValueBytesStore.java:117)
>   at pony.message.MessageWriteStreamsTest.create from mailgun email 
> request(MessageWriteStreamsTest.groovy:52)
> [2018-10-23 19:35:27,189] INFO 
> (org.apache.kafka.streams.processor.internals.StateDirectory) stream-thread 
> [main] Deleting state directory 0_0 for task 0_0 as user calling cleanup.
> {code}
> I've noticed that I can {{put()}} to the store if I first write to it with 
> {{driver.pipeInput}}. But otherwise I get the above error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7481) Consider options for safer upgrade of offset commit value schema

2018-10-24 Thread Ewen Cheslack-Postava (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662460#comment-16662460
 ] 

Ewen Cheslack-Postava commented on KAFKA-7481:
--

[~lindong] we might just want to update the script with a flag to override any 
remaining blockers (and maybe even just list them in the notes for the RC). 
Ideally we don't hit this type of situation regularly, but when we do, it might 
be better to be able to override on the release manager's side rather than 
possibly losing stuff by adjusting priorities on the JIRA.

> Consider options for safer upgrade of offset commit value schema
> 
>
> Key: KAFKA-7481
> URL: https://issues.apache.org/jira/browse/KAFKA-7481
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Priority: Critical
> Fix For: 2.1.0
>
>
> KIP-211 and KIP-320 add new versions of the offset commit value schema. The 
> use of the new schema version is controlled by the 
> `inter.broker.protocol.version` configuration.  Once the new inter-broker 
> version is in use, it is not possible to downgrade since the older brokers 
> will not be able to parse the new schema. 
> The options at the moment are the following:
> 1. Do nothing. Users can try the new version and keep 
> `inter.broker.protocol.version` locked to the old release. Downgrade will 
> still be possible, but users will not be able to test new capabilities which 
> depend on inter-broker protocol changes.
> 2. Instead of using `inter.broker.protocol.version`, we could use 
> `message.format.version`. This would basically extend the use of this config 
> to apply to all persistent formats. The advantage is that it allows users to 
> upgrade the broker and begin using the new inter-broker protocol while still 
> allowing downgrade. But features which depend on the persistent format could 
> not be tested.
> Any other options?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7538) Improve locking model used to update ISRs and HW

2018-10-24 Thread Jun Rao (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662538#comment-16662538
 ] 

Jun Rao commented on KAFKA-7538:


[~rsivaram], thanks for the analysis. I agree that approach 2) is probably the 
easiest fix at this moment. I am wondering how much this will help though. With 
the fix, other producers will be able to proceed to append to the log. If the 
log append is slow, then all those produce requests will be blocked too, tying 
up all the request handlers. So, it seems that we will still need to fix the 
root cause of the problem, which is the slow log append.

> Improve locking model used to update ISRs and HW
> 
>
> Key: KAFKA-7538
> URL: https://issues.apache.org/jira/browse/KAFKA-7538
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 2.1.0
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>Priority: Major
> Fix For: 2.2.0
>
>
> We currently use a ReadWriteLock in Partition to update ISRs and high water 
> mark for the partition. This can result in severe lock contention if there 
> are multiple producers writing a large amount of data into a single partition.
> The current locking model is:
>  # read lock while appending to log on every Produce request on the request 
> handler thread
>  # write lock on leader change, updating ISRs etc. on request handler or 
> scheduler thread
>  # write lock on every replica fetch request to check if ISRs need to be 
> updated and to update HW and ISR on the request handler thread
> 2) is infrequent, but 1) and 3) may be frequent and can result in lock 
> contention. If there are lots of produce requests to a partition from 
> multiple processes, on the leader broker we may see:
>  # one slow log append locks up one request thread for that produce while 
> holding onto the read lock
>  # (replicationFactor-1) request threads can be blocked waiting for write 
> lock to process replica fetch request
>  # potentially several other request threads processing Produce may be queued 
> up to acquire read lock because of the waiting writers.
> In a thread dump with this issue, we noticed several request threads blocked 
> waiting for the write lock, possibly due to replica fetch retries.
>  
> Possible fixes:
>  # Process `Partition#maybeExpandIsr` on a single scheduler thread similar to 
> `Partition#maybeShrinkIsr` so that only a single thread is blocked on the 
> write lock. But this will delay updating ISRs and HW.
>  # Change locking in `Partition#maybeExpandIsr` so that only read lock is 
> acquired to check if ISR needs updating and write lock is acquired only to 
> update ISRs. Also use a different lock for updating HW (perhaps just the 
> Partition object lock) so that typical replica fetch requests complete 
> without acquiring Partition write lock on the request handler thread.
> I will submit a PR for 2), but other suggestions to fix this are welcome.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7481) Consider options for safer upgrade of offset commit value schema

2018-10-24 Thread Dong Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662555#comment-16662555
 ] 

Dong Lin commented on KAFKA-7481:
-

Thanks [~ewencp] for the suggestion. I agree. Will do this in the future.

> Consider options for safer upgrade of offset commit value schema
> 
>
> Key: KAFKA-7481
> URL: https://issues.apache.org/jira/browse/KAFKA-7481
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Priority: Critical
> Fix For: 2.1.0
>
>
> KIP-211 and KIP-320 add new versions of the offset commit value schema. The 
> use of the new schema version is controlled by the 
> `inter.broker.protocol.version` configuration.  Once the new inter-broker 
> version is in use, it is not possible to downgrade since the older brokers 
> will not be able to parse the new schema. 
> The options at the moment are the following:
> 1. Do nothing. Users can try the new version and keep 
> `inter.broker.protocol.version` locked to the old release. Downgrade will 
> still be possible, but users will not be able to test new capabilities which 
> depend on inter-broker protocol changes.
> 2. Instead of using `inter.broker.protocol.version`, we could use 
> `message.format.version`. This would basically extend the use of this config 
> to apply to all persistent formats. The advantage is that it allows users to 
> upgrade the broker and begin using the new inter-broker protocol while still 
> allowing downgrade. But features which depend on the persistent format could 
> not be tested.
> Any other options?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7536) TopologyTestDriver cannot pre-populate KTable or GlobalKTable

2018-10-24 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-7536:
---
Summary: TopologyTestDriver cannot pre-populate KTable or GlobalKTable  
(was: TopologyTestDriver cannot pre-populate KTable)

> TopologyTestDriver cannot pre-populate KTable or GlobalKTable
> -
>
> Key: KAFKA-7536
> URL: https://issues.apache.org/jira/browse/KAFKA-7536
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.0.0
>Reporter: Dmitry Minkovsky
>Priority: Minor
>
> I have a GlobalKTable that's defined as
> {code}
> GlobalKTable userIdsByEmail = topology  
>.globalTable(USER_IDS_BY_EMAIL.name,
>USER_IDS_BY_EMAIL.consumed(),
>Materialized.as("user-ids-by-email"));
> {code}
> And the following test in Spock:
> {code}
> def topology = // my topology
> def driver = new TopologyTestDriver(topology, config())
> def cleanup() {
> driver.close()
> }
> def "create from email request"() {
> def store = driver.getKeyValueStore('user-ids-by-email')
> store.put('string', ByteString.copyFrom(new byte[0]))
> // more, but it fails at the `put` above
> {code}
> When I run this, I get the following:
> {code}
> [2018-10-23 19:35:27,055] INFO 
> (org.apache.kafka.streams.processor.internals.GlobalStateManagerImpl) mock 
> Restoring state for global store user-ids-by-email
> java.lang.NullPointerException
>   at 
> org.apache.kafka.streams.processor.internals.AbstractProcessorContext.topic(AbstractProcessorContext.java:115)
>   at 
> org.apache.kafka.streams.state.internals.CachingKeyValueStore.putInternal(CachingKeyValueStore.java:237)
>   at 
> org.apache.kafka.streams.state.internals.CachingKeyValueStore.put(CachingKeyValueStore.java:220)
>   at 
> org.apache.kafka.streams.state.internals.CachingKeyValueStore.put(CachingKeyValueStore.java:38)
>   at 
> org.apache.kafka.streams.state.internals.InnerMeteredKeyValueStore.put(InnerMeteredKeyValueStore.java:206)
>   at 
> org.apache.kafka.streams.state.internals.MeteredKeyValueBytesStore.put(MeteredKeyValueBytesStore.java:117)
>   at pony.message.MessageWriteStreamsTest.create from mailgun email 
> request(MessageWriteStreamsTest.groovy:52)
> [2018-10-23 19:35:27,189] INFO 
> (org.apache.kafka.streams.processor.internals.StateDirectory) stream-thread 
> [main] Deleting state directory 0_0 for task 0_0 as user calling cleanup.
> {code}
> The same issue applies to KTable.
> I've noticed that I can {{put()}} to the store if I first write to it with 
> {{driver.pipeInput}}. But otherwise I get the above error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7535) KafkaConsumer doesn't report records-lag if isolation.level is read_committed

2018-10-24 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-7535:
---
Fix Version/s: 2.0.2
   2.1.1

> KafkaConsumer doesn't report records-lag if isolation.level is read_committed
> -
>
> Key: KAFKA-7535
> URL: https://issues.apache.org/jira/browse/KAFKA-7535
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 2.0.0
>Reporter: Alexey Vakhrenev
>Assignee: lambdaliu
>Priority: Major
>  Labels: regression
> Fix For: 2.1.1, 2.0.2
>
>
> Starting from 2.0.0, {{KafkaConsumer}} doesn't report {{records-lag}} if 
> {{isolation.level}} is {{read_committed}}. The last version where it works 
> is {{1.1.1}}.
> The issue can be easily reproduced in {{kafka.api.PlaintextConsumerTest}} by 
> adding {{consumerConfig.setProperty("isolation.level", "read_committed")}} 
> within the related tests:
>  - {{testPerPartitionLagMetricsCleanUpWithAssign}}
>  - {{testPerPartitionLagMetricsCleanUpWithSubscribe}}
>  - {{testPerPartitionLagWithMaxPollRecords}}
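
A standalone Java sketch of the affected configuration, with assumed broker 
address and topic; on 2.0.0 the per-partition {{records-lag}} metrics printed 
below never show up when {{read_committed}} is set:

{code:java}
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class LagMetricRepro {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "lag-check");
        props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed"); // triggers the bug
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("test-topic")); // assumed topic
            consumer.poll(Duration.ofSeconds(1));
            // List whatever per-partition lag metrics the consumer registered.
            consumer.metrics().forEach((name, metric) -> {
                if (name.name().contains("records-lag"))
                    System.out.println(name.name() + " " + name.tags() + " = " + metric.metricValue());
            });
        }
    }
}
{code}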



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7535) KafkaConsumer doesn't report records-lag if isolation.level is read_committed

2018-10-24 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-7535:
---
Labels: regression  (was: )

> KafkaConsumer doesn't report records-lag if isolation.level is read_committed
> -
>
> Key: KAFKA-7535
> URL: https://issues.apache.org/jira/browse/KAFKA-7535
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 2.0.0
>Reporter: Alexey Vakhrenev
>Assignee: lambdaliu
>Priority: Major
>  Labels: regression
> Fix For: 2.1.1, 2.0.2
>
>
> Starting from 2.0.0, {{KafkaConsumer}} doesn't report {{records-lag}} if 
> {{isolation.level}} is {{read_committed}}. The last version where it works 
> is {{1.1.1}}.
> The issue can be easily reproduced in {{kafka.api.PlaintextConsumerTest}} by 
> adding {{consumerConfig.setProperty("isolation.level", "read_committed")}} 
> within the related tests:
>  - {{testPerPartitionLagMetricsCleanUpWithAssign}}
>  - {{testPerPartitionLagMetricsCleanUpWithSubscribe}}
>  - {{testPerPartitionLagWithMaxPollRecords}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7536) TopologyTestDriver cannot pre-populate KTable or GlobalKTable

2018-10-24 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662566#comment-16662566
 ] 

Matthias J. Sax commented on KAFKA-7536:


Thanks for the information! Really helpful.

> TopologyTestDriver cannot pre-populate KTable or GlobalKTable
> -
>
> Key: KAFKA-7536
> URL: https://issues.apache.org/jira/browse/KAFKA-7536
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.0.0
>Reporter: Dmitry Minkovsky
>Priority: Minor
>
> I have a GlobalKTable that's defined as
> {code}
> GlobalKTable userIdsByEmail = topology  
>.globalTable(USER_IDS_BY_EMAIL.name,
>USER_IDS_BY_EMAIL.consumed(),
>Materialized.as("user-ids-by-email"));
> {code}
> And the following test in Spock:
> {code}
> def topology = // my topology
> def driver = new TopologyTestDriver(topology, config())
> def cleanup() {
> driver.close()
> }
> def "create from email request"() {
> def store = driver.getKeyValueStore('user-ids-by-email')
> store.put('string', ByteString.copyFrom(new byte[0]))
> // more, but it fails at the `put` above
> {code}
> When I run this, I get the following:
> {code}
> [2018-10-23 19:35:27,055] INFO 
> (org.apache.kafka.streams.processor.internals.GlobalStateManagerImpl) mock 
> Restoring state for global store user-ids-by-email
> java.lang.NullPointerException
>   at 
> org.apache.kafka.streams.processor.internals.AbstractProcessorContext.topic(AbstractProcessorContext.java:115)
>   at 
> org.apache.kafka.streams.state.internals.CachingKeyValueStore.putInternal(CachingKeyValueStore.java:237)
>   at 
> org.apache.kafka.streams.state.internals.CachingKeyValueStore.put(CachingKeyValueStore.java:220)
>   at 
> org.apache.kafka.streams.state.internals.CachingKeyValueStore.put(CachingKeyValueStore.java:38)
>   at 
> org.apache.kafka.streams.state.internals.InnerMeteredKeyValueStore.put(InnerMeteredKeyValueStore.java:206)
>   at 
> org.apache.kafka.streams.state.internals.MeteredKeyValueBytesStore.put(MeteredKeyValueBytesStore.java:117)
>   at pony.message.MessageWriteStreamsTest.create from mailgun email 
> request(MessageWriteStreamsTest.groovy:52)
> [2018-10-23 19:35:27,189] INFO 
> (org.apache.kafka.streams.processor.internals.StateDirectory) stream-thread 
> [main] Deleting state directory 0_0 for task 0_0 as user calling cleanup.
> {code}
> The same issue applies to KTable.
> I've noticed that I can {{put()}} to the store if I first write to it with 
> {{driver.pipeInput}}. But otherwise I get the above error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7481) Consider options for safer upgrade of offset commit value schema

2018-10-24 Thread Dong Lin (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Lin updated KAFKA-7481:

Priority: Blocker  (was: Critical)

> Consider options for safer upgrade of offset commit value schema
> 
>
> Key: KAFKA-7481
> URL: https://issues.apache.org/jira/browse/KAFKA-7481
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Priority: Blocker
> Fix For: 2.1.0
>
>
> KIP-211 and KIP-320 add new versions of the offset commit value schema. The 
> use of the new schema version is controlled by the 
> `inter.broker.protocol.version` configuration.  Once the new inter-broker 
> version is in use, it is not possible to downgrade since the older brokers 
> will not be able to parse the new schema. 
> The options at the moment are the following:
> 1. Do nothing. Users can try the new version and keep 
> `inter.broker.protocol.version` locked to the old release. Downgrade will 
> still be possible, but users will not be able to test new capabilities which 
> depend on inter-broker protocol changes.
> 2. Instead of using `inter.broker.protocol.version`, we could use 
> `message.format.version`. This would basically extend the use of this config 
> to apply to all persistent formats. The advantage is that it allows users to 
> upgrade the broker and begin using the new inter-broker protocol while still 
> allowing downgrade. But features which depend on the persistent format could 
> not be tested.
> Any other options?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-3932) Consumer fails to consume in a round robin fashion

2018-10-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662625#comment-16662625
 ] 

ASF GitHub Bot commented on KAFKA-3932:
---

chienhsingwu opened a new pull request #5838: KAFKA-3932 - Consumer fails to 
consume in a round robin fashion
URL: https://github.com/apache/kafka/pull/5838
 
 
   I think the issue is the statement in 
[KIP-41](https://cwiki.apache.org/confluence/display/KAFKA/KIP-41%3A+KafkaConsumer+Max+Records):
 "As before, we'd keep track of **which partition we left off** at so that the 
next iteration **would begin there**." I think it should **NOT** use the last 
partition in the next iteration; **it should pick the next one instead**.
   
   The simplest way to impose that order is to use the order in which 
consumer.internals.Fetcher receives the partition messages, as determined by 
the **completedFetches** queue in that class. To avoid parsing the partition 
messages repeatedly, we can **save those parsed fetches to a list and track 
the next partition to take messages from there** (see the sketch below).
   
   @hachikuji, @lindong28, @guozhangwang and others, please review.
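   
   For illustration only, a toy Java sketch of that rotation idea (hypothetical 
names, not Fetcher's actual code): keep the parsed fetches per partition in 
insertion order and move the partition just served to the back, so the next 
poll starts from a different one.
   
{code:java}
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of the proposal; the real change would live in Fetcher.
class RoundRobinFetchBufferSketch<V> {
    // insertion-ordered map: iteration order is the round-robin order
    private final Map<String, Deque<V>> parsedFetches = new LinkedHashMap<>();

    void add(String partition, V record) {
        parsedFetches.computeIfAbsent(partition, p -> new ArrayDeque<>()).add(record);
    }

    // Serve one record from the first non-empty partition, then rotate it to the back.
    V pollOne() {
        String chosen = null;
        for (Map.Entry<String, Deque<V>> e : parsedFetches.entrySet()) {
            if (!e.getValue().isEmpty()) {
                chosen = e.getKey();
                break;
            }
        }
        if (chosen == null)
            return null;
        Deque<V> queue = parsedFetches.remove(chosen);
        parsedFetches.put(chosen, queue); // re-insert last: the next call starts elsewhere
        return queue.poll();
    }
}
{code}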
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Consumer fails to consume in a round robin fashion
> --
>
> Key: KAFKA-3932
> URL: https://issues.apache.org/jira/browse/KAFKA-3932
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.0.0
>Reporter: Elias Levy
>Priority: Major
>
> The Java consumer fails to consume messages in a round robin fashion.  This 
> can lead to unbalanced consumption.
> In our use case we have a set of consumers that can take a significant amount 
> of time consuming messages off a topic.  For this reason, we are using the 
> pause/poll/resume pattern to ensure the consumer session does not time out.  
> The topic that is being consumed has been preloaded with messages.  That means 
> there is a significant message lag when the consumer is first started.  To 
> limit how many messages are consumed at a time, the consumer has been 
> configured with max.poll.records=1.
> The first observation is that the client receives a large batch of messages 
> for the first partition it decides to consume from and will consume all those 
> messages before moving on, rather than returning a message from a different 
> partition for each call to poll.
> We solved this issue by configuring max.partition.fetch.bytes to be small 
> enough that only a single message will be returned by the broker on each 
> fetch, although this would not be feasible if message size were highly 
> variable.
> The behavior of the consumer after this change is to largely consume from a 
> small number of partitions, usually just two, iterating between them until it 
> exhausts them, before moving to another partition.  This behavior is 
> problematic if the messages have some rough time semantics and need to be 
> processed roughly in time order across all partitions.
> It would be useful if the consumer had a pluggable API that allowed custom 
> logic to select which partition to consume from next, thus enabling the 
> creation of a round robin partition consumer.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7535) KafkaConsumer doesn't report records-lag if isolation.level is read_committed

2018-10-24 Thread Ismael Juma (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662812#comment-16662812
 ] 

Ismael Juma commented on KAFKA-7535:


cc [~lindong] [~omkreddy] in case you think this is important enough to include 
in 2.0.1/2.1.0. It's a regression, but I haven't verified the severity aside 
from the consumer lag issue.

> KafkaConsumer doesn't report records-lag if isolation.level is read_committed
> -
>
> Key: KAFKA-7535
> URL: https://issues.apache.org/jira/browse/KAFKA-7535
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 2.0.0
>Reporter: Alexey Vakhrenev
>Assignee: lambdaliu
>Priority: Major
>  Labels: regression
> Fix For: 2.1.1, 2.0.2
>
>
> Starting from 2.0.0, {{KafkaConsumer}} doesn't report {{records-lag}} if 
> {{isolation.level}} is {{read_committed}}. The last version where it works 
> is {{1.1.1}}.
> The issue can be easily reproduced in {{kafka.api.PlaintextConsumerTest}} by 
> adding {{consumerConfig.setProperty("isolation.level", "read_committed")}} 
> within the related tests:
>  - {{testPerPartitionLagMetricsCleanUpWithAssign}}
>  - {{testPerPartitionLagMetricsCleanUpWithSubscribe}}
>  - {{testPerPartitionLagWithMaxPollRecords}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7481) Consider options for safer upgrade of offset commit value schema

2018-10-24 Thread Ewen Cheslack-Postava (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16663144#comment-16663144
 ] 

Ewen Cheslack-Postava commented on KAFKA-7481:
--

[~lindong] I think anything here ends up having at least *some* blocker 
component for 2.1 because we're either a) changing the previously understood 
(even if not promised) semantics for a setting or b) need to at least do some 
docs/upgrade notes clarifications.

I think we may also need to consider both short and long term solutions. 
Anything relying on KIP-35 to check inter-broker protocol compatibility does 
not seem like a good idea for 2.1 this late in the release cycle. (I mentioned 
it for completeness and for getting to a good long term solution even if we 
have a short-term hack, but I don't think that's a practical change to get into 
2.1 at this point.) Also, even if we switch to using KIP-35, there's a whole 
compatibility story wrt existing settings that would need to be worked out.

re: your specific proposal, it generally sounds reasonable but wrt testing 
surface area, it actually does increase it to some degree because we would need 
tests that validate correct behavior via KIP-35 checks while upgrading.

Longer term, I definitely think making things "just work" rather than having 
the user manage protocol versions manually during upgrade is better. Just not 
sure we can actually make that happen immediately for this release.


> Consider options for safer upgrade of offset commit value schema
> 
>
> Key: KAFKA-7481
> URL: https://issues.apache.org/jira/browse/KAFKA-7481
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Priority: Blocker
> Fix For: 2.1.0
>
>
> KIP-211 and KIP-320 add new versions of the offset commit value schema. The 
> use of the new schema version is controlled by the 
> `inter.broker.protocol.version` configuration.  Once the new inter-broker 
> version is in use, it is not possible to downgrade since the older brokers 
> will not be able to parse the new schema. 
> The options at the moment are the following:
> 1. Do nothing. Users can try the new version and keep 
> `inter.broker.protocol.version` locked to the old release. Downgrade will 
> still be possible, but users will not be able to test new capabilities which 
> depend on inter-broker protocol changes.
> 2. Instead of using `inter.broker.protocol.version`, we could use 
> `message.format.version`. This would basically extend the use of this config 
> to apply to all persistent formats. The advantage is that it allows users to 
> upgrade the broker and begin using the new inter-broker protocol while still 
> allowing downgrade. But features which depend on the persistent format could 
> not be tested.
> Any other options?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7313) StopReplicaRequest should attempt to remove future replica for the partition only if future replica exists

2018-10-24 Thread Dong Lin (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Lin updated KAFKA-7313:

Fix Version/s: 2.1.1

> StopReplicaRequest should attempt to remove future replica for the partition 
> only if future replica exists
> --
>
> Key: KAFKA-7313
> URL: https://issues.apache.org/jira/browse/KAFKA-7313
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Dong Lin
>Assignee: Dong Lin
>Priority: Major
> Fix For: 2.1.1
>
>
> This patch fixes two issues:
> 1) Currently if a broker received StopReplicaRequest with delete=true for the 
> same offline replica, the first StopReplicaRequest will show 
> KafkaStorageException and the second StopReplicaRequest will show 
> ReplicaNotAvailableException. This is because the first StopReplicaRequest 
> will remove the mapping (tp -> ReplicaManager.OfflinePartition) from 
> ReplicaManager.allPartitions before returning KafkaStorageException, thus the 
> second StopReplicaRequest will not find this partition as offline.
> This result appears to be inconsistent. And since the replica is already 
> offline and the broker will not be able to delete files for this replica, the 
> StopReplicaRequest should fail without making any change and the broker should 
> still remember that this replica is offline. 
> 2) Currently if broker receives StopReplicaRequest with delete=true, the 
> broker will attempt to remove future replica for the partition, which will 
> cause KafkaStorageException in the StopReplicaResponse if this replica does 
> not have future replica. It is problematic to always return 
> KafkaStorageException in the response if future replica does not exist.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7539) ConsumerBounceTest.testClose transient failure

2018-10-24 Thread John Roesler (JIRA)
John Roesler created KAFKA-7539:
---

 Summary: ConsumerBounceTest.testClose transient failure
 Key: KAFKA-7539
 URL: https://issues.apache.org/jira/browse/KAFKA-7539
 Project: Kafka
  Issue Type: Sub-task
Reporter: John Roesler
Assignee: Fangmin Lv
 Fix For: 0.9.0.0


{code}
kafka.api.ConsumerBounceTest > testSeekAndCommitWithBrokerFailures FAILED
java.lang.AssertionError: expected:<1000> but was:<976>
at org.junit.Assert.fail(Assert.java:92)
at org.junit.Assert.failNotEquals(Assert.java:689)
at org.junit.Assert.assertEquals(Assert.java:127)
at org.junit.Assert.assertEquals(Assert.java:514)
at org.junit.Assert.assertEquals(Assert.java:498)
at 
kafka.api.ConsumerBounceTest.seekAndCommitWithBrokerFailures(ConsumerBounceTest.scala:117)
at 
kafka.api.ConsumerBounceTest.testSeekAndCommitWithBrokerFailures(ConsumerBounceTest.scala:98)

kafka.api.ConsumerBounceTest > testSeekAndCommitWithBrokerFailures FAILED
java.lang.AssertionError: expected:<1000> but was:<913>
at org.junit.Assert.fail(Assert.java:92)
at org.junit.Assert.failNotEquals(Assert.java:689)
at org.junit.Assert.assertEquals(Assert.java:127)
at org.junit.Assert.assertEquals(Assert.java:514)
at org.junit.Assert.assertEquals(Assert.java:498)
at 
kafka.api.ConsumerBounceTest.seekAndCommitWithBrokerFailures(ConsumerBounceTest.scala:117)
at 
kafka.api.ConsumerBounceTest.testSeekAndCommitWithBrokerFailures(ConsumerBounceTest.scala:98)
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7535) KafkaConsumer doesn't report records-lag if isolation.level is read_committed

2018-10-24 Thread Dong Lin (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Lin updated KAFKA-7535:

Fix Version/s: 2.1.0

> KafkaConsumer doesn't report records-lag if isolation.level is read_committed
> -
>
> Key: KAFKA-7535
> URL: https://issues.apache.org/jira/browse/KAFKA-7535
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 2.0.0
>Reporter: Alexey Vakhrenev
>Assignee: lambdaliu
>Priority: Major
>  Labels: regression
> Fix For: 2.1.0, 2.1.1, 2.0.2
>
>
> Starting from 2.0.0, {{KafkaConsumer}} doesn't report {{records-lag}} if 
> {{isolation.level}} is {{read_committed}}. The last version where it works 
> is {{1.1.1}}.
> The issue can be easily reproduced in {{kafka.api.PlaintextConsumerTest}} by 
> adding {{consumerConfig.setProperty("isolation.level", "read_committed")}} 
> within the related tests:
>  - {{testPerPartitionLagMetricsCleanUpWithAssign}}
>  - {{testPerPartitionLagMetricsCleanUpWithSubscribe}}
>  - {{testPerPartitionLagWithMaxPollRecords}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7539) ConsumerBounceTest.testClose transient failure

2018-10-24 Thread John Roesler (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler updated KAFKA-7539:

Description: 
{code:java}
java.lang.ArrayIndexOutOfBoundsException: -1
at 
kafka.integration.KafkaServerTestHarness.killBroker(KafkaServerTestHarness.scala:146)
at 
kafka.api.ConsumerBounceTest.checkCloseWithCoordinatorFailure(ConsumerBounceTest.scala:238)
at kafka.api.ConsumerBounceTest.testClose(ConsumerBounceTest.scala:211)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
at 
org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
at 
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
at 
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117)
at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:155)
at 
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:137)
at 
org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:404)
at 
org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:63)
at 
org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:46)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 

[jira] [Updated] (KAFKA-7539) ConsumerBounceTest.testClose transient failure

2018-10-24 Thread John Roesler (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler updated KAFKA-7539:

Labels: system-test-failure  (was: )

> ConsumerBounceTest.testClose transient failure
> --
>
> Key: KAFKA-7539
> URL: https://issues.apache.org/jira/browse/KAFKA-7539
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: John Roesler
>Priority: Major
>
> {code}
> kafka.api.ConsumerBounceTest > testSeekAndCommitWithBrokerFailures FAILED
> java.lang.AssertionError: expected:<1000> but was:<976>
> at org.junit.Assert.fail(Assert.java:92)
> at org.junit.Assert.failNotEquals(Assert.java:689)
> at org.junit.Assert.assertEquals(Assert.java:127)
> at org.junit.Assert.assertEquals(Assert.java:514)
> at org.junit.Assert.assertEquals(Assert.java:498)
> at 
> kafka.api.ConsumerBounceTest.seekAndCommitWithBrokerFailures(ConsumerBounceTest.scala:117)
> at 
> kafka.api.ConsumerBounceTest.testSeekAndCommitWithBrokerFailures(ConsumerBounceTest.scala:98)
> kafka.api.ConsumerBounceTest > testSeekAndCommitWithBrokerFailures FAILED
> java.lang.AssertionError: expected:<1000> but was:<913>
> at org.junit.Assert.fail(Assert.java:92)
> at org.junit.Assert.failNotEquals(Assert.java:689)
> at org.junit.Assert.assertEquals(Assert.java:127)
> at org.junit.Assert.assertEquals(Assert.java:514)
> at org.junit.Assert.assertEquals(Assert.java:498)
> at 
> kafka.api.ConsumerBounceTest.seekAndCommitWithBrokerFailures(ConsumerBounceTest.scala:117)
> at 
> kafka.api.ConsumerBounceTest.testSeekAndCommitWithBrokerFailures(ConsumerBounceTest.scala:98)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7539) ConsumerBounceTest.testClose transient failure

2018-10-24 Thread John Roesler (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler updated KAFKA-7539:

Labels:   (was: newbie)

> ConsumerBounceTest.testClose transient failure
> --
>
> Key: KAFKA-7539
> URL: https://issues.apache.org/jira/browse/KAFKA-7539
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: John Roesler
>Priority: Major
>
> {code}
> kafka.api.ConsumerBounceTest > testSeekAndCommitWithBrokerFailures FAILED
> java.lang.AssertionError: expected:<1000> but was:<976>
> at org.junit.Assert.fail(Assert.java:92)
> at org.junit.Assert.failNotEquals(Assert.java:689)
> at org.junit.Assert.assertEquals(Assert.java:127)
> at org.junit.Assert.assertEquals(Assert.java:514)
> at org.junit.Assert.assertEquals(Assert.java:498)
> at 
> kafka.api.ConsumerBounceTest.seekAndCommitWithBrokerFailures(ConsumerBounceTest.scala:117)
> at 
> kafka.api.ConsumerBounceTest.testSeekAndCommitWithBrokerFailures(ConsumerBounceTest.scala:98)
> kafka.api.ConsumerBounceTest > testSeekAndCommitWithBrokerFailures FAILED
> java.lang.AssertionError: expected:<1000> but was:<913>
> at org.junit.Assert.fail(Assert.java:92)
> at org.junit.Assert.failNotEquals(Assert.java:689)
> at org.junit.Assert.assertEquals(Assert.java:127)
> at org.junit.Assert.assertEquals(Assert.java:514)
> at org.junit.Assert.assertEquals(Assert.java:498)
> at 
> kafka.api.ConsumerBounceTest.seekAndCommitWithBrokerFailures(ConsumerBounceTest.scala:117)
> at 
> kafka.api.ConsumerBounceTest.testSeekAndCommitWithBrokerFailures(ConsumerBounceTest.scala:98)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7539) ConsumerBounceTest.testClose transient failure

2018-10-24 Thread John Roesler (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler updated KAFKA-7539:

Fix Version/s: (was: 0.9.0.0)

> ConsumerBounceTest.testClose transient failure
> --
>
> Key: KAFKA-7539
> URL: https://issues.apache.org/jira/browse/KAFKA-7539
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: John Roesler
>Priority: Major
>
> {code}
> kafka.api.ConsumerBounceTest > testSeekAndCommitWithBrokerFailures FAILED
> java.lang.AssertionError: expected:<1000> but was:<976>
> at org.junit.Assert.fail(Assert.java:92)
> at org.junit.Assert.failNotEquals(Assert.java:689)
> at org.junit.Assert.assertEquals(Assert.java:127)
> at org.junit.Assert.assertEquals(Assert.java:514)
> at org.junit.Assert.assertEquals(Assert.java:498)
> at 
> kafka.api.ConsumerBounceTest.seekAndCommitWithBrokerFailures(ConsumerBounceTest.scala:117)
> at 
> kafka.api.ConsumerBounceTest.testSeekAndCommitWithBrokerFailures(ConsumerBounceTest.scala:98)
> kafka.api.ConsumerBounceTest > testSeekAndCommitWithBrokerFailures FAILED
> java.lang.AssertionError: expected:<1000> but was:<913>
> at org.junit.Assert.fail(Assert.java:92)
> at org.junit.Assert.failNotEquals(Assert.java:689)
> at org.junit.Assert.assertEquals(Assert.java:127)
> at org.junit.Assert.assertEquals(Assert.java:514)
> at org.junit.Assert.assertEquals(Assert.java:498)
> at 
> kafka.api.ConsumerBounceTest.seekAndCommitWithBrokerFailures(ConsumerBounceTest.scala:117)
> at 
> kafka.api.ConsumerBounceTest.testSeekAndCommitWithBrokerFailures(ConsumerBounceTest.scala:98)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KAFKA-7539) ConsumerBounceTest.testClose transient failure

2018-10-24 Thread John Roesler (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler reassigned KAFKA-7539:
---

Assignee: (was: Fangmin Lv)

> ConsumerBounceTest.testClose transient failure
> --
>
> Key: KAFKA-7539
> URL: https://issues.apache.org/jira/browse/KAFKA-7539
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: John Roesler
>Priority: Major
>
> {code}
> kafka.api.ConsumerBounceTest > testSeekAndCommitWithBrokerFailures FAILED
> java.lang.AssertionError: expected:<1000> but was:<976>
> at org.junit.Assert.fail(Assert.java:92)
> at org.junit.Assert.failNotEquals(Assert.java:689)
> at org.junit.Assert.assertEquals(Assert.java:127)
> at org.junit.Assert.assertEquals(Assert.java:514)
> at org.junit.Assert.assertEquals(Assert.java:498)
> at 
> kafka.api.ConsumerBounceTest.seekAndCommitWithBrokerFailures(ConsumerBounceTest.scala:117)
> at 
> kafka.api.ConsumerBounceTest.testSeekAndCommitWithBrokerFailures(ConsumerBounceTest.scala:98)
> kafka.api.ConsumerBounceTest > testSeekAndCommitWithBrokerFailures FAILED
> java.lang.AssertionError: expected:<1000> but was:<913>
> at org.junit.Assert.fail(Assert.java:92)
> at org.junit.Assert.failNotEquals(Assert.java:689)
> at org.junit.Assert.assertEquals(Assert.java:127)
> at org.junit.Assert.assertEquals(Assert.java:514)
> at org.junit.Assert.assertEquals(Assert.java:498)
> at 
> kafka.api.ConsumerBounceTest.seekAndCommitWithBrokerFailures(ConsumerBounceTest.scala:117)
> at 
> kafka.api.ConsumerBounceTest.testSeekAndCommitWithBrokerFailures(ConsumerBounceTest.scala:98)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7539) ConsumerBounceTest.testClose transient failure

2018-10-24 Thread John Roesler (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler updated KAFKA-7539:

Labels:   (was: system-test-failure)

> ConsumerBounceTest.testClose transient failure
> --
>
> Key: KAFKA-7539
> URL: https://issues.apache.org/jira/browse/KAFKA-7539
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: John Roesler
>Priority: Major
>
> {code}
> kafka.api.ConsumerBounceTest > testSeekAndCommitWithBrokerFailures FAILED
> java.lang.AssertionError: expected:<1000> but was:<976>
> at org.junit.Assert.fail(Assert.java:92)
> at org.junit.Assert.failNotEquals(Assert.java:689)
> at org.junit.Assert.assertEquals(Assert.java:127)
> at org.junit.Assert.assertEquals(Assert.java:514)
> at org.junit.Assert.assertEquals(Assert.java:498)
> at 
> kafka.api.ConsumerBounceTest.seekAndCommitWithBrokerFailures(ConsumerBounceTest.scala:117)
> at 
> kafka.api.ConsumerBounceTest.testSeekAndCommitWithBrokerFailures(ConsumerBounceTest.scala:98)
> kafka.api.ConsumerBounceTest > testSeekAndCommitWithBrokerFailures FAILED
> java.lang.AssertionError: expected:<1000> but was:<913>
> at org.junit.Assert.fail(Assert.java:92)
> at org.junit.Assert.failNotEquals(Assert.java:689)
> at org.junit.Assert.assertEquals(Assert.java:127)
> at org.junit.Assert.assertEquals(Assert.java:514)
> at org.junit.Assert.assertEquals(Assert.java:498)
> at 
> kafka.api.ConsumerBounceTest.seekAndCommitWithBrokerFailures(ConsumerBounceTest.scala:117)
> at 
> kafka.api.ConsumerBounceTest.testSeekAndCommitWithBrokerFailures(ConsumerBounceTest.scala:98)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7541) Transient Failure: kafka.server.DynamicBrokerReconfigurationTest.testUncleanLeaderElectionEnable

2018-10-24 Thread John Roesler (JIRA)
John Roesler created KAFKA-7541:
---

 Summary: Transient Failure: 
kafka.server.DynamicBrokerReconfigurationTest.testUncleanLeaderElectionEnable
 Key: KAFKA-7541
 URL: https://issues.apache.org/jira/browse/KAFKA-7541
 Project: Kafka
  Issue Type: Bug
Reporter: John Roesler


Observed on Java 11: 
[https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/264/testReport/junit/kafka.server/DynamicBrokerReconfigurationTest/testUncleanLeaderElectionEnable/]

 

Stacktrace:
{noformat}
java.lang.AssertionError: Unclean leader not elected
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at 
kafka.server.DynamicBrokerReconfigurationTest.testUncleanLeaderElectionEnable(DynamicBrokerReconfigurationTest.scala:487)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
at 
org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
at 
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
at 
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 

[jira] [Updated] (KAFKA-7540) Transient failure: kafka.api.ConsumerBounceTest.testClose

2018-10-24 Thread John Roesler (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler updated KAFKA-7540:

Labels: flaky-test  (was: )

> Transient failure: kafka.api.ConsumerBounceTest.testClose
> -
>
> Key: KAFKA-7540
> URL: https://issues.apache.org/jira/browse/KAFKA-7540
> Project: Kafka
>  Issue Type: Bug
>Reporter: John Roesler
>Priority: Major
>  Labels: flaky-test
>
> Observed on Java 8: 
> [https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/17314/testReport/junit/kafka.api/ConsumerBounceTest/testClose/]
>  
> Stacktrace:
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: -1
>   at 
> kafka.integration.KafkaServerTestHarness.killBroker(KafkaServerTestHarness.scala:146)
>   at 
> kafka.api.ConsumerBounceTest.checkCloseWithCoordinatorFailure(ConsumerBounceTest.scala:238)
>   at kafka.api.ConsumerBounceTest.testClose(ConsumerBounceTest.scala:211)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
>   at 
> org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>   at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>   at 
> org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>   at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
>   at 
> org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117)
>   at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:155)
>   at 
> 

[jira] [Commented] (KAFKA-7535) KafkaConsumer doesn't report records-lag if isolation.level is read_committed

2018-10-24 Thread Dong Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662916#comment-16662916
 ] 

Dong Lin commented on KAFKA-7535:
-

[~ijuma] Thanks for the notice. Yeah I would like to have this issue fixed in 
2.1.0.

> KafkaConsumer doesn't report records-lag if isolation.level is read_committed
> -
>
> Key: KAFKA-7535
> URL: https://issues.apache.org/jira/browse/KAFKA-7535
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 2.0.0
>Reporter: Alexey Vakhrenev
>Assignee: lambdaliu
>Priority: Major
>  Labels: regression
> Fix For: 2.1.0, 2.1.1, 2.0.2
>
>
> Starting from 2.0.0, {{KafkaConsumer}} doesn't report {{records-lag}} if 
> {{isolation.level}} is {{read_committed}}. The last version where it works 
> is {{1.1.1}}.
> The issue can be easily reproduced in {{kafka.api.PlaintextConsumerTest}} by 
> adding {{consumerConfig.setProperty("isolation.level", "read_committed")}} 
> within the related tests (a standalone sketch follows the list):
>  - {{testPerPartitionLagMetricsCleanUpWithAssign}}
>  - {{testPerPartitionLagMetricsCleanUpWithSubscribe}}
>  - {{testPerPartitionLagWithMaxPollRecords}}
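
For context, a minimal standalone sketch of the same reproduction (bootstrap 
address, group id, and topic name are placeholders): on affected versions the 
lag metrics below are not reported under read_committed.

{code:java}
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ReadCommittedLagCheck {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "lag-check");               // placeholder
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        // The setting under discussion; with "read_uncommitted" the lag metrics appear.
        props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("test-topic"));      // placeholder
            consumer.poll(Duration.ofSeconds(5));
            // Print every lag-related metric; on 2.0.0 with read_committed none shows up.
            consumer.metrics().forEach((name, metric) -> {
                if (name.name().contains("records-lag"))
                    System.out.println(name.name() + " = " + metric.metricValue());
            });
        }
    }
}
{code}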



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7539) ConsumerBounceTest.testClose transient failure

2018-10-24 Thread John Roesler (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler resolved KAFKA-7539.
-
Resolution: Invalid
  Reviewer:   (was: Jason Gustafson)

Cloning to create this issue was a mistake. I'm going to create a fresh one.

> ConsumerBounceTest.testClose transient failure
> --
>
> Key: KAFKA-7539
> URL: https://issues.apache.org/jira/browse/KAFKA-7539
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: John Roesler
>Priority: Major
>  Labels: flaky-test
>
> {code:java}
> java.lang.ArrayIndexOutOfBoundsException: -1
>   at 
> kafka.integration.KafkaServerTestHarness.killBroker(KafkaServerTestHarness.scala:146)
>   at 
> kafka.api.ConsumerBounceTest.checkCloseWithCoordinatorFailure(ConsumerBounceTest.scala:238)
>   at kafka.api.ConsumerBounceTest.testClose(ConsumerBounceTest.scala:211)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
>   at 
> org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>   at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>   at 
> org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>   at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
>   at 
> org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117)
>   at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:155)
>   at 
> 

[jira] [Updated] (KAFKA-7539) ConsumerBounceTest.testClose transient failure

2018-10-24 Thread John Roesler (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler updated KAFKA-7539:

Labels: flaky-test  (was: )

> ConsumerBounceTest.testClose transient failure
> --
>
> Key: KAFKA-7539
> URL: https://issues.apache.org/jira/browse/KAFKA-7539
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: John Roesler
>Priority: Major
>  Labels: flaky-test
>
> {code:java}
> java.lang.ArrayIndexOutOfBoundsException: -1
>   at 
> kafka.integration.KafkaServerTestHarness.killBroker(KafkaServerTestHarness.scala:146)
>   at 
> kafka.api.ConsumerBounceTest.checkCloseWithCoordinatorFailure(ConsumerBounceTest.scala:238)
>   at kafka.api.ConsumerBounceTest.testClose(ConsumerBounceTest.scala:211)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
>   at 
> org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>   at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>   at 
> org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>   at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
>   at 
> org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117)
>   at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:155)
>   at 
> org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:137)
>   at 
> 

[jira] [Created] (KAFKA-7543) Kafka Connect JDBC Sink failing to establish connection to Topic, while the connection is working fine with standalone consumer

2018-10-24 Thread Kashyap Ivaturi (JIRA)
Kashyap Ivaturi created KAFKA-7543:
--

 Summary: Kafka Connect JDBC Sink failing to establish connection 
to Topic, while the connection is working fine with standalone consumer
 Key: KAFKA-7543
 URL: https://issues.apache.org/jira/browse/KAFKA-7543
 Project: Kafka
  Issue Type: Task
  Components: KafkaConnect
Reporter: Kashyap Ivaturi


Hi,

I'm trying to build a Kafka Connect JDBC Sink Connector to update my DB with 
the data arriving in a Kafka topic. I had implemented JDBC Source Connectors 
before and they worked very well, but when I run this Sink Connector, its 
internal consumer fails to connect to the topic and keeps disconnecting from 
the Kafka broker in a loop. With TRACE enabled I got the details below in the 
log. Any idea why the consumer is unable to connect to the topic? A standalone 
consumer from another of my applications connects to the same topic and reads 
messages from it without problems. Please let me know if you have any 
suggestions.
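
For context, a sketch of the shape of the sink configuration involved (the 
connector class is the usual Confluent JDBC sink; the topic, URL, and other 
values are placeholders, not the reporter's actual settings):

{noformat}
name=hrmsAckEvents
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1
# placeholder topic
topics=hrms-ack-events
# placeholder connection URL
connection.url=jdbc:oracle:thin:@//db-host:1521/SERVICE
auto.create=true
{noformat}

Note that the sink task's consumer is configured by the Connect worker, not by 
the connector: if the broker listener (port 9093 above) requires TLS or SASL, 
the worker config needs matching consumer.* overrides (e.g. 
consumer.security.protocol), which a standalone consumer would carry in its 
own properties.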

 

[2018-10-24 23:03:24,134] INFO WorkerSinkTask\{id=hrmsAckEvents-0} Sink task 
finished initialization and start 
(org.apache.kafka.connect.runtime.WorkerSinkTask:268)

[2018-10-24 23:03:24,135] TRACE WorkerSinkTask\{id=hrmsAckEvents-0} Polling 
consumer with timeout 4875 ms 
(org.apache.kafka.connect.runtime.WorkerSinkTask:282)

[2018-10-24 23:03:24,136] TRACE [Consumer clientId=consumer-1, groupId=hrmsack] 
Found least loaded node messaging-rtp3.cisco.com:9093 (id: -1 rack: null) 
(org.apache.kafka.clients.NetworkClient:123)

[2018-10-24 23:03:24,136] DEBUG [Consumer clientId=consumer-1, groupId=hrmsack] 
Sending GroupCoordinator request to broker messaging-rtp3.cisco.com:9093 (id: 
-1 rack: null) 
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:183)

[2018-10-24 23:03:24,281] DEBUG [Consumer clientId=consumer-1, groupId=hrmsack] 
Initiating connection to node messaging-rtp3.cisco.com:9093 (id: -1 rack: null) 
(org.apache.kafka.clients.NetworkClient:183)

[2018-10-24 23:03:24,293] TRACE [Consumer clientId=consumer-1, groupId=hrmsack] 
Found least loaded node messaging-rtp3.cisco.com:9093 (id: -1 rack: null) 
(org.apache.kafka.clients.NetworkClient:123)

[2018-10-24 23:03:24,295] TRACE [Consumer clientId=consumer-1, groupId=hrmsack] 
Found least loaded node messaging-rtp3.cisco.com:9093 (id: -1 rack: null) 
(org.apache.kafka.clients.NetworkClient:123)

[2018-10-24 23:03:24,346] TRACE [Consumer clientId=consumer-1, groupId=hrmsack] 
Found least loaded node messaging-rtp3.cisco.com:9093 (id: -1 rack: null) 
(org.apache.kafka.clients.NetworkClient:123)

[2018-10-24 23:03:24,365] DEBUG Added sensor with name node--1.bytes-sent 
(org.apache.kafka.common.metrics.Metrics:404)

[2018-10-24 23:03:24,367] DEBUG Added sensor with name node--1.bytes-received 
(org.apache.kafka.common.metrics.Metrics:404)

[2018-10-24 23:03:24,374] DEBUG Added sensor with name node--1.latency 
(org.apache.kafka.common.metrics.Metrics:404)

[2018-10-24 23:03:24,376] DEBUG [Consumer clientId=consumer-1, groupId=hrmsack] 
Created socket with SO_RCVBUF = 65536, SO_SNDBUF = 131072, SO_TIMEOUT = 0 to 
node -1 (org.apache.kafka.common.network.Selector:195)

[2018-10-24 23:03:24,377] DEBUG [Consumer clientId=consumer-1, groupId=hrmsack] 
Completed connection to node -1. Fetching API versions. 
(org.apache.kafka.clients.NetworkClient:183)

[2018-10-24 23:03:24,377] DEBUG [Consumer clientId=consumer-1, groupId=hrmsack] 
Initiating API versions fetch from node -1. 
(org.apache.kafka.clients.NetworkClient:183)

[2018-10-24 23:03:24,378] TRACE [Consumer clientId=consumer-1, groupId=hrmsack] 
No version information found when sending API_VERSIONS with correlation id 1 to 
node -1. Assuming version 1. (org.apache.kafka.clients.NetworkClient:135)

[2018-10-24 23:03:24,380] TRACE [Consumer clientId=consumer-1, groupId=hrmsack] 
Sending API_VERSIONS {} with correlation id 1 to node -1 
(org.apache.kafka.clients.NetworkClient:135)

[2018-10-24 23:03:24,385] TRACE [Consumer clientId=consumer-1, groupId=hrmsack] 
Found least loaded node messaging-rtp3.cisco.com:9093 (id: -1 rack: null) 
(org.apache.kafka.clients.NetworkClient:123)

[2018-10-24 23:03:24,389] TRACE [Consumer clientId=consumer-1, groupId=hrmsack] 
Found least loaded node messaging-rtp3.cisco.com:9093 (id: -1 rack: null) 
(org.apache.kafka.clients.NetworkClient:123)

[2018-10-24 23:03:24,724] TRACE [Consumer clientId=consumer-1, groupId=hrmsack] 
Found least loaded node messaging-rtp3.cisco.com:9093 (id: -1 rack: null) 
(org.apache.kafka.clients.NetworkClient:123)

[2018-10-24 23:03:24,725] DEBUG [Consumer clientId=consumer-1, groupId=hrmsack] 
Connection with messaging-rtp3.cisco.com/64.101.96.6 disconnected 
(org.apache.kafka.common.network.Selector:189)

java.io.EOFException

at 

[jira] [Created] (KAFKA-7544) Transient Failure: org.apache.kafka.streams.integration.EosIntegrationTest.shouldNotViolateEosIfOneTaskFails

2018-10-24 Thread John Roesler (JIRA)
John Roesler created KAFKA-7544:
---

 Summary: Transient Failure: 
org.apache.kafka.streams.integration.EosIntegrationTest.shouldNotViolateEosIfOneTaskFails
 Key: KAFKA-7544
 URL: https://issues.apache.org/jira/browse/KAFKA-7544
 Project: Kafka
  Issue Type: Bug
  Components: streams
Reporter: John Roesler


Observed on Java 11: 
[https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/264/testReport/junit/org.apache.kafka.streams.integration/EosIntegrationTest/shouldNotViolateEosIfOneTaskFails/]

at 
[https://github.com/apache/kafka/pull/5795/commits/5bdcd0e023c6f406d585155399f6541bb6a9f9c2]

 

stacktrace:
{noformat}
java.lang.AssertionError: 
Expected: <[KeyValue(0, 0), KeyValue(0, 1), KeyValue(0, 2), KeyValue(0, 3), 
KeyValue(0, 4), KeyValue(0, 5), KeyValue(0, 6), KeyValue(0, 7), KeyValue(0, 8), 
KeyValue(0, 9)]>
 but: was <[KeyValue(0, 0), KeyValue(0, 1), KeyValue(0, 2), KeyValue(0, 3), 
KeyValue(0, 4), KeyValue(0, 5), KeyValue(0, 6), KeyValue(0, 7), KeyValue(0, 8), 
KeyValue(0, 9), KeyValue(0, 10), KeyValue(0, 11), KeyValue(0, 12), KeyValue(0, 
13), KeyValue(0, 14)]>
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
at 
org.apache.kafka.streams.integration.EosIntegrationTest.checkResultPerKey(EosIntegrationTest.java:218)
at 
org.apache.kafka.streams.integration.EosIntegrationTest.shouldNotViolateEosIfOneTaskFails(EosIntegrationTest.java:360)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
at 
org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
at 
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
at 
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117)
at 

[jira] [Updated] (KAFKA-7539) [deleted] ConsumerBounceTest.testClose transient failure

2018-10-24 Thread John Roesler (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler updated KAFKA-7539:

Summary: [deleted] ConsumerBounceTest.testClose transient failure  (was: 
ConsumerBounceTest.testClose transient failure)

> [deleted] ConsumerBounceTest.testClose transient failure
> 
>
> Key: KAFKA-7539
> URL: https://issues.apache.org/jira/browse/KAFKA-7539
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: John Roesler
>Priority: Major
>  Labels: flaky-test
>
> {code:java}
> java.lang.ArrayIndexOutOfBoundsException: -1
>   at 
> kafka.integration.KafkaServerTestHarness.killBroker(KafkaServerTestHarness.scala:146)
>   at 
> kafka.api.ConsumerBounceTest.checkCloseWithCoordinatorFailure(ConsumerBounceTest.scala:238)
>   at kafka.api.ConsumerBounceTest.testClose(ConsumerBounceTest.scala:211)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
>   at 
> org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>   at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>   at 
> org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>   at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
>   at 
> org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117)
>   at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:155)
>   at 
> 

[jira] [Created] (KAFKA-7540) Transient failure: kafka.api.ConsumerBounceTest.testClose

2018-10-24 Thread John Roesler (JIRA)
John Roesler created KAFKA-7540:
---

 Summary: Transient failure: kafka.api.ConsumerBounceTest.testClose
 Key: KAFKA-7540
 URL: https://issues.apache.org/jira/browse/KAFKA-7540
 Project: Kafka
  Issue Type: Bug
Reporter: John Roesler


Observed on Java 8: 
[https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/17314/testReport/junit/kafka.api/ConsumerBounceTest/testClose/]

 

Stacktrace:
{noformat}
java.lang.ArrayIndexOutOfBoundsException: -1
at 
kafka.integration.KafkaServerTestHarness.killBroker(KafkaServerTestHarness.scala:146)
at 
kafka.api.ConsumerBounceTest.checkCloseWithCoordinatorFailure(ConsumerBounceTest.scala:238)
at kafka.api.ConsumerBounceTest.testClose(ConsumerBounceTest.scala:211)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
at 
org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
at 
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
at 
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117)
at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:155)
at 
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:137)
at 
org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:404)
at 

[jira] [Closed] (KAFKA-7539) [deleted] ConsumerBounceTest.testClose transient failure

2018-10-24 Thread John Roesler (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler closed KAFKA-7539.
---

> [deleted] ConsumerBounceTest.testClose transient failure
> 
>
> Key: KAFKA-7539
> URL: https://issues.apache.org/jira/browse/KAFKA-7539
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: John Roesler
>Priority: Major
>  Labels: flaky-test
>
> {code:java}
> java.lang.ArrayIndexOutOfBoundsException: -1
>   at 
> kafka.integration.KafkaServerTestHarness.killBroker(KafkaServerTestHarness.scala:146)
>   at 
> kafka.api.ConsumerBounceTest.checkCloseWithCoordinatorFailure(ConsumerBounceTest.scala:238)
>   at kafka.api.ConsumerBounceTest.testClose(ConsumerBounceTest.scala:211)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
>   at 
> org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>   at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>   at 
> org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>   at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
>   at 
> org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117)
>   at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:155)
>   at 
> org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:137)
>   at 
> 

[jira] [Updated] (KAFKA-7542) Transient Failure: kafka.server.GssapiAuthenticationTest.testServerAuthenticationFailure

2018-10-24 Thread John Roesler (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler updated KAFKA-7542:

Labels: flaky-test  (was: )

> Transient Failure: 
> kafka.server.GssapiAuthenticationTest.testServerAuthenticationFailure
> 
>
> Key: KAFKA-7542
> URL: https://issues.apache.org/jira/browse/KAFKA-7542
> Project: Kafka
>  Issue Type: Bug
>Reporter: John Roesler
>Priority: Major
>  Labels: flaky-test
>
> Observed on Java 11: 
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/264/testReport/junit/kafka.server/GssapiAuthenticationTest/testServerAuthenticationFailure/]
> at 
> [https://github.com/apache/kafka/pull/5795/commits/5bdcd0e023c6f406d585155399f6541bb6a9f9c2]
>  
> Stacktrace:
> {noformat}
> org.scalatest.junit.JUnitTestFailedError: 
>   at 
> org.scalatest.junit.AssertionsForJUnit$class.newAssertionFailedException(AssertionsForJUnit.scala:100)
>   at 
> org.scalatest.junit.JUnitSuite.newAssertionFailedException(JUnitSuite.scala:71)
>   at org.scalatest.Assertions$class.fail(Assertions.scala:1075)
>   at org.scalatest.junit.JUnitSuite.fail(JUnitSuite.scala:71)
>   at 
> kafka.server.GssapiAuthenticationTest.testServerAuthenticationFailure(GssapiAuthenticationTest.scala:128)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
>   at 
> org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>   at 
> org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>   at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
>   at 
> 

[jira] [Created] (KAFKA-7542) Transient Failure: kafka.server.GssapiAuthenticationTest.testServerAuthenticationFailure

2018-10-24 Thread John Roesler (JIRA)
John Roesler created KAFKA-7542:
---

 Summary: Transient Failure: 
kafka.server.GssapiAuthenticationTest.testServerAuthenticationFailure
 Key: KAFKA-7542
 URL: https://issues.apache.org/jira/browse/KAFKA-7542
 Project: Kafka
  Issue Type: Bug
Reporter: John Roesler


Observed on Java 11: 
[https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/264/testReport/junit/kafka.server/GssapiAuthenticationTest/testServerAuthenticationFailure/]

at 
[https://github.com/apache/kafka/pull/5795/commits/5bdcd0e023c6f406d585155399f6541bb6a9f9c2]

 

Stacktrace:
{noformat}
org.scalatest.junit.JUnitTestFailedError: 
at org.scalatest.junit.AssertionsForJUnit$class.newAssertionFailedException(AssertionsForJUnit.scala:100)
at org.scalatest.junit.JUnitSuite.newAssertionFailedException(JUnitSuite.scala:71)
at org.scalatest.Assertions$class.fail(Assertions.scala:1075)
at org.scalatest.junit.JUnitSuite.fail(JUnitSuite.scala:71)
at kafka.server.GssapiAuthenticationTest.testServerAuthenticationFailure(GssapiAuthenticationTest.scala:128)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
at org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
at org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at 

[jira] [Created] (KAFKA-7545) Use auth_to_local rules from krb5.conf

2018-10-24 Thread Pradeep Bansal (JIRA)
Pradeep Bansal created KAFKA-7545:
-

 Summary: Use auth_to_local rules from krb5.conf
 Key: KAFKA-7545
 URL: https://issues.apache.org/jira/browse/KAFKA-7545
 Project: Kafka
  Issue Type: Improvement
  Components: security
Reporter: Pradeep Bansal


Currently I have to replicate all auth_to_local rules from my krb5.conf and 
pass them to sasl.kerberos.principal.to.local.rules to make them work. This 
creates a maintenance burden.

 

It would be very helpful if Kafka could read the auth_to_local rules from 
krb5.conf directly.
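
For illustration, the duplication looks roughly like this (the realm and rule 
below are made up; sasl.kerberos.principal.to.local.rules is the real Kafka 
property, but the values are only an example):
{noformat}
# krb5.conf ([realms] section)
EXAMPLE.COM = {
  auth_to_local = RULE:[1:$1@$0](.*@EXAMPLE.COM)s/@.*//
  auth_to_local = DEFAULT
}

# server.properties -- today the same rules must be repeated for Kafka
sasl.kerberos.principal.to.local.rules=RULE:[1:$1@$0](.*@EXAMPLE.COM)s/@.*//,DEFAULT
{noformat}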



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7481) Consider options for safer upgrade of offset commit value schema

2018-10-24 Thread Dong Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16663274#comment-16663274
 ] 

Dong Lin commented on KAFKA-7481:
-

Hey [~ewencp], regarding the change for the 2.1.0 release, say we don't make 
further code changes for this issue, I agree it is good to clarify in the 
upgrade note that inter.broker.protocol.version cannot be reverted after it is 
bumped to 2.1.0.

Regarding the short term solution, I also prefer not to make a big code change, 
e.g. using the KIP-35 idea, to solve the issue here. I would prefer to just 
clarify in the upgrade note that inter.broker.protocol.version cannot be 
reverted after it is bumped to 2.1.0. Also, since we currently do not mention 
anything about downgrade in the upgrade note, and the other config 
log.message.format.version cannot be downgraded, I am not sure users actually 
expect to be able to downgrade inter.broker.protocol.version. So I feel this 
short term solution is OK and, strictly speaking, it does not break any 
semantic guarantee.

Regarding the long term solution, it seems that we actually want users to 
manually manage the protocol version config in order to pick up any new feature 
that changes the on-disk data format. Otherwise, if we always made things work 
with one rolling bounce, then whenever a feature changed the on-disk format we 
would have to bump the major version of the next Kafka release to indicate that 
it cannot be downgraded, which delays acceptance of the release. Also, if we 
automatically bumped message.format.version with the new broker version, broker 
performance would degrade significantly, because most users in an organization 
won't have had time to upgrade their client library version yet.
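
For readers following the upgrade note discussion, the rolling-upgrade pinning 
it refers to looks like this (a sketch; the version values are illustrative):
{noformat}
# server.properties while rolling the cluster from 2.0 to 2.1
inter.broker.protocol.version=2.0
log.message.format.version=2.0

# Only after every broker runs 2.1 and the upgrade is verified, bump the
# protocol version. Per this discussion, this step cannot be reverted once
# brokers start writing the new offset commit value schema.
inter.broker.protocol.version=2.1
{noformat}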

 

> Consider options for safer upgrade of offset commit value schema
> 
>
> Key: KAFKA-7481
> URL: https://issues.apache.org/jira/browse/KAFKA-7481
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Priority: Blocker
> Fix For: 2.1.0
>
>
> KIP-211 and KIP-320 add new versions of the offset commit value schema. The 
> use of the new schema version is controlled by the 
> `inter.broker.protocol.version` configuration.  Once the new inter-broker 
> version is in use, it is not possible to downgrade since the older brokers 
> will not be able to parse the new schema. 
> The options at the moment are the following:
> 1. Do nothing. Users can try the new version and keep 
> `inter.broker.protocol.version` locked to the old release. Downgrade will 
> still be possible, but users will not be able to test new capabilities which 
> depend on inter-broker protocol changes.
> 2. Instead of using `inter.broker.protocol.version`, we could use 
> `message.format.version`. This would basically extend the use of this config 
> to apply to all persistent formats. The advantage is that it allows users to 
> upgrade the broker and begin using the new inter-broker protocol while still 
> allowing downgrade. But features which depend on the persistent format could 
> not be tested.
> Any other options?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7535) KafkaConsumer doesn't report records-lag if isolation.level is read_committed

2018-10-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16661786#comment-16661786
 ] 

ASF GitHub Bot commented on KAFKA-7535:
---

lambdaliu opened a new pull request #5835: KAFKA-7535: KafkaConsumer doesn't 
report records-lag if isolation.level is read_committed
URL: https://github.com/apache/kafka/pull/5835
 
 
   FetchResponse should return the partitionData's lastStableOffset when there 
is no need to convert data.
   
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
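
As a quick repro sketch outside the test suite (a minimal, hypothetical 
consumer; the bootstrap address and topic name are placeholders), the 
per-partition lag metric never appears under read_committed on affected 
versions:
{noformat}
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ReadCommittedLagRepro {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "lag-repro");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed"); // triggers the bug on 2.0.0

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("test-topic")); // placeholder topic
            consumer.poll(Duration.ofSeconds(5));
            // On 2.0.0 this prints nothing under read_committed; on 1.1.1 the
            // records-lag metrics show up after fetches complete.
            consumer.metrics().forEach((name, metric) -> {
                if (name.name().contains("records-lag"))
                    System.out.println(name.name() + " = " + metric.metricValue());
            });
        }
    }
}
{noformat}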


> KafkaConsumer doesn't report records-lag if isolation.level is read_committed
> -
>
> Key: KAFKA-7535
> URL: https://issues.apache.org/jira/browse/KAFKA-7535
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 2.0.0
>Reporter: Alexey Vakhrenev
>Assignee: lambdaliu
>Priority: Major
>
> Starting from 2.0.0, {{KafkaConsumer}} doesn't report {{records-lag}} if 
> {{isolation.level}} is {{read_committed}}. The last version, where it works 
> is {{1.1.1}}.
> The issue can be easily reproduced in {{kafka.api.PlaintextConsumerTest}} by 
> adding {{consumerConfig.setProperty("isolation.level", "read_committed")}} 
> within related tests:
>  - {{testPerPartitionLagMetricsCleanUpWithAssign}}
>  - {{testPerPartitionLagMetricsCleanUpWithSubscribe}}
>  - {{testPerPartitionLagWithMaxPollRecords}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7535) KafkaConsumer doesn't report records-lag if isolation.level is read_committed

2018-10-24 Thread lambdaliu (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16661789#comment-16661789
 ] 

lambdaliu commented on KAFKA-7535:
--

Bug introduced by this [PR|https://github.com/apache/kafka/pull/5192].

> KafkaConsumer doesn't report records-lag if isolation.level is read_committed
> -
>
> Key: KAFKA-7535
> URL: https://issues.apache.org/jira/browse/KAFKA-7535
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 2.0.0
>Reporter: Alexey Vakhrenev
>Assignee: lambdaliu
>Priority: Major
>
> Starting from 2.0.0, {{KafkaConsumer}} doesn't report {{records-lag}} if 
> {{isolation.level}} is {{read_committed}}. The last version, where it works 
> is {{1.1.1}}.
> The issue can be easily reproduced in {{kafka.api.PlaintextConsumerTest}} by 
> adding {{consumerConfig.setProperty("isolation.level", "read_committed")}} 
> within related tests:
>  - {{testPerPartitionLagMetricsCleanUpWithAssign}}
>  - {{testPerPartitionLagMetricsCleanUpWithSubscribe}}
>  - {{testPerPartitionLagWithMaxPollRecords}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7481) Consider options for safer upgrade of offset commit value schema

2018-10-24 Thread Dong Lin (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Lin updated KAFKA-7481:

Fix Version/s: 2.1.0

> Consider options for safer upgrade of offset commit value schema
> 
>
> Key: KAFKA-7481
> URL: https://issues.apache.org/jira/browse/KAFKA-7481
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Priority: Blocker
> Fix For: 2.1.0
>
>
> KIP-211 and KIP-320 add new versions of the offset commit value schema. The 
> use of the new schema version is controlled by the 
> `inter.broker.protocol.version` configuration.  Once the new inter-broker 
> version is in use, it is not possible to downgrade since the older brokers 
> will not be able to parse the new schema. 
> The options at the moment are the following:
> 1. Do nothing. Users can try the new version and keep 
> `inter.broker.protocol.version` locked to the old release. Downgrade will 
> still be possible, but users will not be able to test new capabilities which 
> depend on inter-broker protocol changes.
> 2. Instead of using `inter.broker.protocol.version`, we could use 
> `message.format.version`. This would basically extend the use of this config 
> to apply to all persistent formats. The advantage is that it allows users to 
> upgrade the broker and begin using the new inter-broker protocol while still 
> allowing downgrade. But features which depend on the persistent format could 
> not be tested.
> Any other options?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7481) Consider options for safer upgrade of offset commit value schema

2018-10-24 Thread Dong Lin (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Lin updated KAFKA-7481:

Priority: Critical  (was: Blocker)

> Consider options for safer upgrade of offset commit value schema
> 
>
> Key: KAFKA-7481
> URL: https://issues.apache.org/jira/browse/KAFKA-7481
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Priority: Critical
> Fix For: 2.1.0
>
>
> KIP-211 and KIP-320 add new versions of the offset commit value schema. The 
> use of the new schema version is controlled by the 
> `inter.broker.protocol.version` configuration.  Once the new inter-broker 
> version is in use, it is not possible to downgrade since the older brokers 
> will not be able to parse the new schema. 
> The options at the moment are the following:
> 1. Do nothing. Users can try the new version and keep 
> `inter.broker.protocol.version` locked to the old release. Downgrade will 
> still be possible, but users will not be able to test new capabilities which 
> depend on inter-broker protocol changes.
> 2. Instead of using `inter.broker.protocol.version`, we could use 
> `message.format.version`. This would basically extend the use of this config 
> to apply to all persistent formats. The advantage is that it allows users to 
> upgrade the broker and begin using the new inter-broker protocol while still 
> allowing downgrade. But features which depend on the persistent format could 
> not be tested.
> Any other options?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7165) Error while creating ephemeral at /brokers/ids/BROKER_ID

2018-10-24 Thread Jonathan Santilli (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16661931#comment-16661931
 ] 

Jonathan Santilli commented on KAFKA-7165:
--

Hello Jun Rao, I would like to continue working on this bug. I hope you can 
find some time to elaborate a bit more on your proposal from 
[https://github.com/apache/kafka/pull/5575#issuecomment-416419017]:

 
{noformat}
An alternative approach is to retry the creation of the ephemeral node up to 
sth like twice the session timeout. It may take a bit long for the broker to be 
re-registered. However, it seems it's a bit safer and simpler, until 
ZOOKEEPER-2985 is fixed.{noformat}
 

Cheers,

--

Jonathan
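
To make the quoted proposal concrete, here is a minimal sketch of the retry 
idea (a hypothetical helper, not the actual KafkaZkClient code; the backoff 
value is made up):
{noformat}
// Keep retrying ephemeral creation until ~2x the session timeout has elapsed,
// tolerating a stale node still owned by the previous (expired) session.
void createEphemeralWithRetry(ZooKeeper zk, String path, byte[] data,
                              int sessionTimeoutMs) throws Exception {
    long deadlineMs = System.currentTimeMillis() + 2L * sessionTimeoutMs;
    while (true) {
        try {
            zk.create(path, data, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
            return; // broker re-registered
        } catch (KeeperException.NodeExistsException e) {
            // The old ephemeral node may linger until ZooKeeper expires the
            // previous session (see ZOOKEEPER-2985); wait and try again.
            if (System.currentTimeMillis() >= deadlineMs)
                throw e;
            Thread.sleep(200); // arbitrary backoff
        }
    }
}
{noformat}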

> Error while creating ephemeral at /brokers/ids/BROKER_ID
> 
>
> Key: KAFKA-7165
> URL: https://issues.apache.org/jira/browse/KAFKA-7165
> Project: Kafka
>  Issue Type: Bug
>  Components: core, zkclient
>Affects Versions: 1.1.0
>Reporter: Jonathan Santilli
>Assignee: Jonathan Santilli
>Priority: Major
>
> Kafka version: 1.1.0
> Zookeeper version: 3.4.12
> 4 Kafka Brokers
> 4 Zookeeper servers
>  
> In one of the 4 brokers of the cluster, we detect the following error:
> [2018-07-14 04:38:23,784] INFO Unable to read additional data from server 
> sessionid 0x3000c2420cb458d, likely server has closed socket, closing socket 
> connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
>  [2018-07-14 04:38:24,509] INFO Opening socket connection to server 
> *ZOOKEEPER_SERVER_1:PORT*. Will not attempt to authenticate using SASL 
> (unknown error) (org.apache.zookeeper.ClientCnxn)
>  [2018-07-14 04:38:24,510] INFO Socket connection established to 
> *ZOOKEEPER_SERVER_1:PORT*, initiating session 
> (org.apache.zookeeper.ClientCnxn)
>  [2018-07-14 04:38:24,513] INFO Unable to read additional data from server 
> sessionid 0x3000c2420cb458d, likely server has closed socket, closing socket 
> connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
>  [2018-07-14 04:38:25,287] INFO Opening socket connection to server 
> *ZOOKEEPER_SERVER_2:PORT*. Will not attempt to authenticate using SASL 
> (unknown error) (org.apache.zookeeper.ClientCnxn)
>  [2018-07-14 04:38:25,287] INFO Socket connection established to 
> *ZOOKEEPER_SERVER_2:PORT*, initiating session 
> (org.apache.zookeeper.ClientCnxn)
>  [2018-07-14 04:38:25,954] INFO [Partition TOPIC_NAME-PARTITION-# broker=1|#* 
> broker=1] Shrinking ISR from 1,3,4,2 to 1,4,2 (kafka.cluster.Partition)
>  [2018-07-14 04:38:26,444] WARN Unable to reconnect to ZooKeeper service, 
> session 0x3000c2420cb458d has expired (org.apache.zookeeper.ClientCnxn)
>  [2018-07-14 04:38:26,444] INFO Unable to reconnect to ZooKeeper service, 
> session 0x3000c2420cb458d has expired, closing socket connection 
> (org.apache.zookeeper.ClientCnxn)
>  [2018-07-14 04:38:26,445] INFO EventThread shut down for session: 
> 0x3000c2420cb458d (org.apache.zookeeper.ClientCnxn)
>  [2018-07-14 04:38:26,446] INFO [ZooKeeperClient] Session expired. 
> (kafka.zookeeper.ZooKeeperClient)
>  [2018-07-14 04:38:26,459] INFO [ZooKeeperClient] Initializing a new session 
> to 
> *ZOOKEEPER_SERVER_1:PORT*,*ZOOKEEPER_SERVER_2:PORT*,*ZOOKEEPER_SERVER_3:PORT*,*ZOOKEEPER_SERVER_4:PORT*.
>  (kafka.zookeeper.ZooKeeperClient)
>  [2018-07-14 04:38:26,459] INFO Initiating client connection, 
> connectString=*ZOOKEEPER_SERVER_1:PORT*,*ZOOKEEPER_SERVER_2:PORT*,*ZOOKEEPER_SERVER_3:PORT*,*ZOOKEEPER_SERVER_4:PORT*
>  sessionTimeout=6000 
> watcher=kafka.zookeeper.ZooKeeperClient$ZooKeeperClientWatcher$@44821a96 
> (org.apache.zookeeper.ZooKeeper)
>  [2018-07-14 04:38:26,465] INFO Opening socket connection to server 
> *ZOOKEEPER_SERVER_1:PORT*. Will not attempt to authenticate using SASL 
> (unknown error) (org.apache.zookeeper.ClientCnxn)
>  [2018-07-14 04:38:26,477] INFO Socket connection established to 
> *ZOOKEEPER_SERVER_1:PORT*, initiating session 
> (org.apache.zookeeper.ClientCnxn)
>  [2018-07-14 04:38:26,484] INFO Session establishment complete on server 
> *ZOOKEEPER_SERVER_1:PORT*, sessionid = 0x4005b59eb6a, negotiated timeout 
> = 6000 (org.apache.zookeeper.ClientCnxn)
>  [2018-07-14 04:38:26,496] *INFO Creating /brokers/ids/1* (is it secure? 
> false) (kafka.zk.KafkaZkClient)
>  [2018-07-14 04:38:26,500] INFO Processing notification(s) to /config/changes 
> (kafka.common.ZkNodeChangeNotificationListener)
>  *[2018-07-14 04:38:26,547] ERROR Error while creating ephemeral at 
> /brokers/ids/1, node already exists and owner '216186131422332301' does not 
> match current session '288330817911521280' 
> (kafka.zk.KafkaZkClient$CheckedEphemeral)*
>  [2018-07-14 04:38:26,547] *INFO Result of znode creation at /brokers/ids/1 
> is: NODEEXISTS* (kafka.zk.KafkaZkClient)
>  [2018-07-14 04:38:26,559] 

[jira] [Comment Edited] (KAFKA-7165) Error while creating ephemeral at /brokers/ids/BROKER_ID

2018-10-24 Thread Jonathan Santilli (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16661931#comment-16661931
 ] 

Jonathan Santilli edited comment on KAFKA-7165 at 10/24/18 8:31 AM:


Hello Jun Rao, I would like to continue working on this bug. I hope you can 
find some time to elaborate a bit more on your proposal from 
[https://github.com/apache/kafka/pull/5575#issuecomment-416419017]:

 
{noformat}
An alternative approach is to retry the creation of the ephemeral node up to 
sth like twice the session timeout. It may take a bit long for the broker to be 
re-registered. However, it seems it's a bit safer and simpler, until 
ZOOKEEPER-2985 is fixed.{noformat}
 

Cheers,

–

Jonathan


was (Author: pachilo):
Hello Juan Rao, I would like to continue working on this bug, hope you can have 
some time to elaborate a little bit more your proposal from 
[https://github.com/apache/kafka/pull/5575#issuecomment-416419017:]

 
{noformat}
An alternative approach is to retry the creation of the ephemeral node up to 
sth like twice the session timeout. It may take a bit long for the broker to be 
re-registered. However, it seems it's a bit safer and simpler, until 
ZOOKEEPER-2985 is fixed.{noformat}
 

Cheers,

--

Jonathan

> Error while creating ephemeral at /brokers/ids/BROKER_ID
> 
>
> Key: KAFKA-7165
> URL: https://issues.apache.org/jira/browse/KAFKA-7165
> Project: Kafka
>  Issue Type: Bug
>  Components: core, zkclient
>Affects Versions: 1.1.0
>Reporter: Jonathan Santilli
>Assignee: Jonathan Santilli
>Priority: Major
>
> Kafka version: 1.1.0
> Zookeeper version: 3.4.12
> 4 Kafka Brokers
> 4 Zookeeper servers
>  
> In one of the 4 brokers of the cluster, we detect the following error:
> [2018-07-14 04:38:23,784] INFO Unable to read additional data from server 
> sessionid 0x3000c2420cb458d, likely server has closed socket, closing socket 
> connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
>  [2018-07-14 04:38:24,509] INFO Opening socket connection to server 
> *ZOOKEEPER_SERVER_1:PORT*. Will not attempt to authenticate using SASL 
> (unknown error) (org.apache.zookeeper.ClientCnxn)
>  [2018-07-14 04:38:24,510] INFO Socket connection established to 
> *ZOOKEEPER_SERVER_1:PORT*, initiating session 
> (org.apache.zookeeper.ClientCnxn)
>  [2018-07-14 04:38:24,513] INFO Unable to read additional data from server 
> sessionid 0x3000c2420cb458d, likely server has closed socket, closing socket 
> connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
>  [2018-07-14 04:38:25,287] INFO Opening socket connection to server 
> *ZOOKEEPER_SERVER_2:PORT*. Will not attempt to authenticate using SASL 
> (unknown error) (org.apache.zookeeper.ClientCnxn)
>  [2018-07-14 04:38:25,287] INFO Socket connection established to 
> *ZOOKEEPER_SERVER_2:PORT*, initiating session 
> (org.apache.zookeeper.ClientCnxn)
>  [2018-07-14 04:38:25,954] INFO [Partition TOPIC_NAME-PARTITION-# broker=1|#* 
> broker=1] Shrinking ISR from 1,3,4,2 to 1,4,2 (kafka.cluster.Partition)
>  [2018-07-14 04:38:26,444] WARN Unable to reconnect to ZooKeeper service, 
> session 0x3000c2420cb458d has expired (org.apache.zookeeper.ClientCnxn)
>  [2018-07-14 04:38:26,444] INFO Unable to reconnect to ZooKeeper service, 
> session 0x3000c2420cb458d has expired, closing socket connection 
> (org.apache.zookeeper.ClientCnxn)
>  [2018-07-14 04:38:26,445] INFO EventThread shut down for session: 
> 0x3000c2420cb458d (org.apache.zookeeper.ClientCnxn)
>  [2018-07-14 04:38:26,446] INFO [ZooKeeperClient] Session expired. 
> (kafka.zookeeper.ZooKeeperClient)
>  [2018-07-14 04:38:26,459] INFO [ZooKeeperClient] Initializing a new session 
> to 
> *ZOOKEEPER_SERVER_1:PORT*,*ZOOKEEPER_SERVER_2:PORT*,*ZOOKEEPER_SERVER_3:PORT*,*ZOOKEEPER_SERVER_4:PORT*.
>  (kafka.zookeeper.ZooKeeperClient)
>  [2018-07-14 04:38:26,459] INFO Initiating client connection, 
> connectString=*ZOOKEEPER_SERVER_1:PORT*,*ZOOKEEPER_SERVER_2:PORT*,*ZOOKEEPER_SERVER_3:PORT*,*ZOOKEEPER_SERVER_4:PORT*
>  sessionTimeout=6000 
> watcher=kafka.zookeeper.ZooKeeperClient$ZooKeeperClientWatcher$@44821a96 
> (org.apache.zookeeper.ZooKeeper)
>  [2018-07-14 04:38:26,465] INFO Opening socket connection to server 
> *ZOOKEEPER_SERVER_1:PORT*. Will not attempt to authenticate using SASL 
> (unknown error) (org.apache.zookeeper.ClientCnxn)
>  [2018-07-14 04:38:26,477] INFO Socket connection established to 
> *ZOOKEEPER_SERVER_1:PORT*, initiating session 
> (org.apache.zookeeper.ClientCnxn)
>  [2018-07-14 04:38:26,484] INFO Session establishment complete on server 
> *ZOOKEEPER_SERVER_1:PORT*, sessionid = 0x4005b59eb6a, negotiated timeout 
> = 6000 (org.apache.zookeeper.ClientCnxn)
>  [2018-07-14 04:38:26,496] *INFO Creating 

[jira] [Created] (KAFKA-7537) Only include live brokers in the UpdateMetadataRequest sent to existing brokers if there is no change in the partition states

2018-10-24 Thread Zhanxiang (Patrick) Huang (JIRA)
Zhanxiang (Patrick) Huang created KAFKA-7537:


 Summary: Only include live brokers in the UpdateMetadataRequest 
sent to existing brokers if there is no change in the partition states
 Key: KAFKA-7537
 URL: https://issues.apache.org/jira/browse/KAFKA-7537
 Project: Kafka
  Issue Type: Improvement
  Components: controller
Reporter: Zhanxiang (Patrick) Huang
Assignee: Zhanxiang (Patrick) Huang


Currently, when brokers join or leave the cluster without any change in 
partition states, the controller sends out UpdateMetadataRequests containing 
the states of all partitions to all brokers. But for existing brokers in the 
cluster, the metadata diff between the controller and the broker should only be 
the "live_brokers" info. Only brokers with an empty metadata cache need the 
full UpdateMetadataRequest. Sending the full UpdateMetadataRequest to all 
brokers can place non-negligible memory pressure on the controller side.

Let's say we have N brokers and M partitions in total in the cluster, and we 
want to add 1 brand new broker. With RF=2, the memory footprint per partition 
in the UpdateMetadataRequest is ~200 bytes. In the current controller 
implementation, if each of the N RequestSendThreads serializes and sends out 
the UpdateMetadataRequest at roughly the same time (which is very likely the 
case), we will end up using *(N+1)*M*200B*. In a large Kafka cluster, we can 
have:
{noformat}
N=99
M=100k

Memory usage to send out UpdateMetadataRequest to all brokers:
100 * 100K * 200B = 2G

However, we only need to send the full UpdateMetadataRequest to the newly added 
broker. We only need to include live broker ids (4B * 100 brokers) in the 
UpdateMetadataRequest sent to the existing 99 brokers. So the amount of data 
that is actually needed will be:
1 * 100K * 200B + 99 * (100 * 4B) = ~21M


We can potentially reduce the memory footprint by 2G / 21M = ~95x, as well as 
the data transferred over the network.{noformat}
 

This issue hurts the scalability of a Kafka cluster. KIP-380 and KAFKA-7186 
also help to further reduce the controller memory footprint.

 

In terms of implementation, we can keep some in-memory state on the controller 
side to differentiate existing brokers from uninitialized brokers (e.g. brand 
new brokers), so that when there is no change in partition states, we only send 
the live brokers info to existing brokers; see the sketch below.
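
A minimal sketch of that idea (hypothetical names throughout -- Broker, 
sendUpdateMetadataRequest, and allPartitionStates are made-up helpers, and the 
real controller is Scala and structured differently):
{noformat}
// Controller-side state tracking which brokers already hold full metadata.
Set<Integer> initializedBrokerIds = new HashSet<>();

void onBrokerChange(List<Broker> liveBrokers, boolean partitionStatesChanged) {
    List<Integer> liveBrokerIds = liveBrokers.stream()
            .map(Broker::id).collect(Collectors.toList());
    for (Broker broker : liveBrokers) {
        if (!partitionStatesChanged && initializedBrokerIds.contains(broker.id())) {
            // Existing broker: the only diff is the live broker set (~4B per id),
            // so skip serializing all M partition states (~200B each).
            sendUpdateMetadataRequest(broker, liveBrokerIds, Collections.emptyList());
        } else {
            // Uninitialized (e.g. brand new) broker: needs the full partition state.
            sendUpdateMetadataRequest(broker, liveBrokerIds, allPartitionStates());
            initializedBrokerIds.add(broker.id());
        }
    }
}
{noformat}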

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)