[jira] [Commented] (KAFKA-7940) Flaky Test CustomQuotaCallbackTest#testCustomQuotaCallback

2019-03-15 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16794108#comment-16794108
 ] 

Matthias J. Sax commented on KAFKA-7940:


Failed again: 
[https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/69/testReport/junit/kafka.api/CustomQuotaCallbackTest/testCustomQuotaCallback/]

StackTrace is different:
{quote}java.lang.AssertionError: Partition [group1_largeTopic,69] metadata not propagated after 15000 ms
at kafka.utils.TestUtils$.fail(TestUtils.scala:381)
at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:791)
at kafka.utils.TestUtils$.waitUntilMetadataIsPropagated(TestUtils.scala:880)
at kafka.utils.TestUtils$.$anonfun$createTopic$6(TestUtils.scala:360)
at kafka.utils.TestUtils$.$anonfun$createTopic$6$adapted(TestUtils.scala:359)
at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237)
at scala.collection.Iterator.foreach(Iterator.scala:941)
at scala.collection.Iterator.foreach$(Iterator.scala:941)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
at scala.collection.MapLike$DefaultKeySet.foreach(MapLike.scala:181)
at scala.collection.TraversableLike.map(TraversableLike.scala:237)
at scala.collection.TraversableLike.map$(TraversableLike.scala:230)
at scala.collection.AbstractSet.scala$collection$SetLike$$super$map(Set.scala:51)
at scala.collection.SetLike.map(SetLike.scala:104)
at scala.collection.SetLike.map$(SetLike.scala:104)
at scala.collection.AbstractSet.map(Set.scala:51)
at kafka.utils.TestUtils$.createTopic(TestUtils.scala:359)
at kafka.utils.TestUtils$.createTopic(TestUtils.scala:332)
at kafka.api.CustomQuotaCallbackTest.createTopic(CustomQuotaCallbackTest.scala:181)
at kafka.api.CustomQuotaCallbackTest.testCustomQuotaCallback(CustomQuotaCallbackTest.scala:136){quote}
STDOUT
{quote}[2019-03-15 16:44:31,140] WARN SASL configuration failed: javax.security.auth.login.LoginException: No JAAS configuration section named 'Client' was found in specified JAAS configuration file: '/tmp/kafka8953054928214446748.tmp'. Will continue connection to Zookeeper server without SASL authentication, if Zookeeper server allows it. (org.apache.zookeeper.ClientCnxn:1011)
[2019-03-15 16:44:31,140] ERROR [ZooKeeperClient] Auth failed. (kafka.zookeeper.ZooKeeperClient:74)
[2019-03-15 16:44:31,545] WARN SASL configuration failed: javax.security.auth.login.LoginException: No JAAS configuration section named 'Client' was found in specified JAAS configuration file: '/tmp/kafka8953054928214446748.tmp'. Will continue connection to Zookeeper server without SASL authentication, if Zookeeper server allows it. (org.apache.zookeeper.ClientCnxn:1011)
[2019-03-15 16:44:31,545] ERROR [ZooKeeperClient] Auth failed. (kafka.zookeeper.ZooKeeperClient:74)
Completed Updating config for entity: user-principal 'scram-admin'.
[2019-03-15 16:44:31,597] WARN SASL configuration failed: javax.security.auth.login.LoginException: No JAAS configuration section named 'Client' was found in specified JAAS configuration file: '/tmp/kafka8953054928214446748.tmp'. Will continue connection to Zookeeper server without SASL authentication, if Zookeeper server allows it. (org.apache.zookeeper.ClientCnxn:1011)
[2019-03-15 16:44:31,599] ERROR [ZooKeeperClient] Auth failed. (kafka.zookeeper.ZooKeeperClient:74)
[2019-03-15 16:44:31,728] WARN SASL configuration failed: javax.security.auth.login.LoginException: No JAAS configuration section named 'Client' was found in specified JAAS configuration file: '/tmp/kafka8953054928214446748.tmp'. Will continue connection to Zookeeper server without SASL authentication, if Zookeeper server allows it. (org.apache.zookeeper.ClientCnxn:1011)
[2019-03-15 16:44:31,728] ERROR [ZooKeeperClient] Auth failed. (kafka.zookeeper.ZooKeeperClient:74)
[2019-03-15 16:44:32,592] WARN SASL configuration failed: javax.security.auth.login.LoginException: No JAAS configuration section named 'Client' was found in specified JAAS configuration file: '/tmp/kafka8953054928214446748.tmp'. Will continue connection to Zookeeper server without SASL authentication, if Zookeeper server allows it. (org.apache.zookeeper.ClientCnxn:1011)
[2019-03-15 16:44:32,604] ERROR [ZooKeeperClient] Auth failed. (kafka.zookeeper.ZooKeeperClient:74)
Completed Updating config for entity: user-principal 'group0_user1'.
[2019-03-15 16:44:36,625] WARN SASL configuration failed: javax.security.auth.login.LoginException: No JAAS configuration section named 'Client' was found in specified JAAS configuration file: '/tmp/kafka8953054928214446748.tmp'. Will continue connection to Zookeeper server without SASL authentication, if Zookeeper server allows it. (org.apache.zookeeper.ClientCnxn:1011)
[2019-03-15 16:44:36,625] ERROR [ZooKeeperClient] Auth failed. (kafka.zookeeper.ZooKeeperClient:74)
Completed Updating config for entity: user-principal 'group0_user2'.{quote}
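For reference, the AssertionError above is raised by Kafka's TestUtils.waitUntilTrue, which polls a condition (here, metadata propagation) until a timeout expires. A minimal Java sketch of that polling pattern, as an illustrative stand-in for the actual Scala helper (names and poll interval are hypothetical, not Kafka's real code):

{code:java}
import java.util.function.BooleanSupplier;

// Illustrative stand-in for kafka.utils.TestUtils.waitUntilTrue: poll a
// condition until it holds or the timeout elapses, then fail the test.
final class WaitUtil {
    static void waitUntilTrue(BooleanSupplier condition, String failMessage,
                              long timeoutMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) {
                return; // e.g. partition metadata propagated to all brokers
            }
            Thread.sleep(100); // poll interval
        }
        // This is the point that produced the failure quoted above.
        throw new AssertionError(failMessage + " after " + timeoutMs + " ms");
    }
}
{code}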

[jira] [Updated] (KAFKA-7855) Kafka Streams Maven Archetype quickstart fails to compile out of the box

2019-03-15 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-7855:
---
Affects Version/s: (was: 2.0.1)
   2.0.0

> Kafka Streams Maven Archetype quickstart fails to compile out of the box
> 
>
> Key: KAFKA-7855
> URL: https://issues.apache.org/jira/browse/KAFKA-7855
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.0.0
> Environment: Java 8, OS X 10.13.6
>Reporter: Michael Drogalis
>Assignee: Kristian Aurlien
>Priority: Major
>  Labels: newbie++
> Fix For: 2.0.2, 2.3.0, 2.1.2, 2.2.1
>
> Attachments: output.log
>
>
> When I follow the [quickstart 
> tutorial|https://kafka.apache.org/21/documentation/streams/tutorial] and 
> issue the command to set up a new Maven project, the generated example fails 
> to compile. Adding a Produced.with() on the source seems to fix this. I've 
> attached the compiler output.
>  
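For anyone hitting the same compile error: the workaround described above amounts to declaring serdes explicitly rather than relying on inference against the default serdes. A minimal sketch of a pipe topology with explicit Consumed/Produced serdes (topic names are hypothetical, not necessarily what the archetype generates):

{code:java}
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

public class PipeExample {
    public static Topology buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();
        // Declare serdes on the source instead of relying on defaults.
        KStream<String, String> source = builder.stream(
                "streams-plaintext-input",
                Consumed.with(Serdes.String(), Serdes.String()));
        // The reported fix: declare serdes on the sink via Produced.with(...).
        source.to("streams-pipe-output",
                Produced.with(Serdes.String(), Serdes.String()));
        return builder.build();
    }
}
{code}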



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7855) Kafka Streams Maven Archetype quickstart fails to compile out of the box

2019-03-15 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-7855:
---
Affects Version/s: (was: 2.1.0)
   2.0.1

> Kafka Streams Maven Archetype quickstart fails to compile out of the box
> 
>
> Key: KAFKA-7855
> URL: https://issues.apache.org/jira/browse/KAFKA-7855
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.0.1
> Environment: Java 8, OS X 10.13.6
>Reporter: Michael Drogalis
>Assignee: Kristian Aurlien
>Priority: Major
>  Labels: newbie++
> Attachments: output.log
>
>
> When I follow the [quickstart 
> tutorial|https://kafka.apache.org/21/documentation/streams/tutorial] and 
> issue the command to set up a new Maven project, the generated example fails 
> to compile. Adding a Produced.with() on the source seems to fix this. I've 
> attached the compiler output.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7855) Kafka Streams Maven Archetype quickstart fails to compile out of the box

2019-03-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16794101#comment-16794101
 ] 

ASF GitHub Bot commented on KAFKA-7855:
---

mjsax commented on pull request #6194: KAFKA-7855: Kafka Streams Maven 
Archetype quickstart fails to compile out of the box
URL: https://github.com/apache/kafka/pull/6194
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Kafka Streams Maven Archetype quickstart fails to compile out of the box
> 
>
> Key: KAFKA-7855
> URL: https://issues.apache.org/jira/browse/KAFKA-7855
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.1.0
> Environment: Java 8, OS X 10.13.6
>Reporter: Michael Drogalis
>Assignee: Kristian Aurlien
>Priority: Major
>  Labels: newbie++
> Attachments: output.log
>
>
> When I follow the [quickstart 
> tutorial|https://kafka.apache.org/21/documentation/streams/tutorial] and 
> issue the command to set up a new Maven project, the generated example fails 
> to compile. Adding a Produced.with() on the source seems to fix this. I've 
> attached the compiler output.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8116) Add Kafka Streams archetype for Java11

2019-03-15 Thread Matthias J. Sax (JIRA)
Matthias J. Sax created KAFKA-8116:
--

 Summary: Add Kafka Streams archetype for Java11
 Key: KAFKA-8116
 URL: https://issues.apache.org/jira/browse/KAFKA-8116
 Project: Kafka
  Issue Type: Bug
  Components: streams
Reporter: Matthias J. Sax


In https://issues.apache.org/jira/browse/KAFKA-5727 we added an archetype for 
Kafka Streams. However, this archetype only works with Java 8, not with Java 11. 
Thus, we should add a new archetype project for Java 11.

This ticket requires a KIP: 
[https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KAFKA-8027) Gradual decline in performance of CachingWindowStore provider when number of keys grow

2019-03-15 Thread Sophie Blee-Goldman (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793994#comment-16793994
 ] 

Sophie Blee-Goldman edited comment on KAFKA-8027 at 3/15/19 10:28 PM:
--

Hi [~prashantideal], I have been looking into this and have two PRs aimed at 
improving performance of segmented stores with caching enabled. Would you be 
able to test either or both of them out, and let me know if they improve things 
at all? You can find the first PR 
[here|https://github.com/apache/kafka/pull/6433] and the second one 
[here|https://github.com/apache/kafka/pull/6448].

Keep in mind these are just improvements to the caching layer and are unlikely 
to result in overall better fetching performance than withCachingDisabled, 
since as you point out for range queries we must search the underlying 
RocksDBStore anyway. If you don't need caching for other reasons (eg reducing 
downstream traffic or writes to RocksDB) and can afford to turn it off, I 
recommend doing so. 


was (Author: ableegoldman):
Hi [~prashantideal], I have been looking into this and have two PRs aimed at 
improving performance of segmented stores with caching enabled. Would you be 
able to test either or both of them out, and let me know if they improve things 
at all? You can find the first PR 
[here|https://github.com/apache/kafka/pull/6433] and the second one 
[here|https://github.com/apache/kafka/pull/6448].

Keep in mind these are just improvements to the caching layer and are unlikely 
to result in overall better performance than withCachingDisabled, since as you 
point out for range queries we must search the underlying RocksDBStore anyway. 
If you don't need caching for other reasons (eg reducing downstream traffic) 
and can afford to turn it off, I recommend doing so. 

> Gradual decline in performance of CachingWindowStore provider when number of 
> keys grow
> --
>
> Key: KAFKA-8027
> URL: https://issues.apache.org/jira/browse/KAFKA-8027
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.1.0
>Reporter: Prashant
>Priority: Major
>  Labels: interactivequ, kafka-streams
>
> We observed this during a performance test of our stream application, which 
> tracks user activity and provides a REST interface to query the window state 
> store. We used the default configuration of Materialized, i.e. 
> withCachingEnabled, for storing user behaviour stats in a window state store 
> (CompositeWindowStore with CachingWindowStore as the underlying store, which 
> internally uses RocksDBStore for persistence).
> While querying the window store with store.fetch(key, long, long), it 
> internally tries to fetch the range from ThreadCache, which uses a byte 
> iterator to search for a key in the cache and, on a cache miss, goes to 
> RocksDBStore for the result. So when the number of keys in the cache becomes 
> large, this ThreadCache search starts taking time (a range iterator over all 
> keys), which impacts WindowStore query performance.
>  
> Workaround: If we disable the cache with the switch on the Materialized 
> instance, i.e. withCachingDisabled, the key search is delegated directly to 
> RocksDBStore, which is much faster and completes the search in microseconds, 
> versus milliseconds in the case of CachingWindowStore.
>  
> Stats: With unique users > 0.5M, random search for a key, i.e. UserId:
>  
> withCachingEnabled: 40ms < t < 80ms (upper bound increases as unique users 
> grow)
> withCachingDisabled: t < 1ms (almost constant time)
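To make the workaround concrete, here is a minimal sketch of materializing a windowed count with caching disabled, so that fetches bypass the ThreadCache and go straight to RocksDB (topic name, store name, and window size are hypothetical):

{code:java}
import java.time.Duration;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.TimeWindows;
import org.apache.kafka.streams.state.WindowStore;

public class ActivityCounts {
    public static void define(StreamsBuilder builder) {
        builder.stream("user-activity",
                       Consumed.with(Serdes.String(), Serdes.String()))
               .groupByKey()
               .windowedBy(TimeWindows.of(Duration.ofMinutes(5)))
               // withCachingDisabled() is the switch referred to above: reads
               // skip the ThreadCache and hit the RocksDB segments directly.
               .count(Materialized.<String, Long, WindowStore<Bytes, byte[]>>as(
                               "activity-counts")
                       .withCachingDisabled());
    }
}
{code}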



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8115) Flaky Test CoordinatorTest#testTaskRequestWithOldStartMsGetsUpdated

2019-03-15 Thread Matthias J. Sax (JIRA)
Matthias J. Sax created KAFKA-8115:
--

 Summary: Flaky Test 
CoordinatorTest#testTaskRequestWithOldStartMsGetsUpdated
 Key: KAFKA-8115
 URL: https://issues.apache.org/jira/browse/KAFKA-8115
 Project: Kafka
  Issue Type: Bug
  Components: core, unit tests
Affects Versions: 2.3.0
Reporter: Matthias J. Sax
 Fix For: 2.3.0


[https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/3254/testReport/junit/org.apache.kafka.trogdor.coordinator/CoordinatorTest/testTaskRequestWithOldStartMsGetsUpdated/]
{quote}org.junit.runners.model.TestTimedOutException: test timed out after 12 milliseconds
at java.base@11.0.1/jdk.internal.misc.Unsafe.park(Native Method)
at java.base@11.0.1/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234)
at java.base@11.0.1/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2123)
at java.base@11.0.1/java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1454)
at java.base@11.0.1/java.util.concurrent.Executors$DelegatedExecutorService.awaitTermination(Executors.java:709)
at app//org.apache.kafka.trogdor.rest.JsonRestServer.waitForShutdown(JsonRestServer.java:157)
at app//org.apache.kafka.trogdor.agent.Agent.waitForShutdown(Agent.java:123)
at app//org.apache.kafka.trogdor.common.MiniTrogdorCluster.close(MiniTrogdorCluster.java:285)
at app//org.apache.kafka.trogdor.coordinator.CoordinatorTest.testTaskRequestWithOldStartMsGetsUpdated(CoordinatorTest.java:596)
at java.base@11.0.1/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base@11.0.1/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base@11.0.1/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base@11.0.1/java.lang.reflect.Method.invoke(Method.java:566)
at app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
at java.base@11.0.1/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base@11.0.1/java.lang.Thread.run(Thread.java:834){quote}
STDOUT
{quote}[2019-03-15 09:23:41,364] INFO Creating MiniTrogdorCluster with agents: node02 and coordinator: node01 (org.apache.kafka.trogdor.common.MiniTrogdorCluster:135)
[2019-03-15 09:23:41,595] INFO Logging initialized @13340ms to org.eclipse.jetty.util.log.Slf4jLog (org.eclipse.jetty.util.log:193)
[2019-03-15 09:23:41,752] INFO Starting REST server (org.apache.kafka.trogdor.rest.JsonRestServer:89)
[2019-03-15 09:23:41,912] INFO Registered resource org.apache.kafka.trogdor.agent.AgentRestResource@3fa38ceb (org.apache.kafka.trogdor.rest.JsonRestServer:94)
[2019-03-15 09:23:42,178] INFO jetty-9.4.14.v20181114; built: 2018-11-14T21:20:31.478Z; git: c4550056e785fb5665914545889f21dc136ad9e6; jvm 11.0.1+13-LTS (org.eclipse.jetty.server.Server:370)
[2019-03-15 09:23:42,360] INFO DefaultSessionIdManager workerName=node0 (org.eclipse.jetty.server.session:365)
[2019-03-15 09:23:42,362] INFO No SessionScavenger set, using defaults (org.eclipse.jetty.server.session:370)
[2019-03-15 09:23:42,370] INFO node0 Scavenging every 66ms (org.eclipse.jetty.server.session:149)
[2019-03-15 09:23:44,412] INFO Started o.e.j.s.ServletContextHandler@335a5293{/,null,AVAILABLE} (org.eclipse.jetty.server.handler.ContextHandler:855)
[2019-03-15 09:23:44,473] INFO Started ServerConnector@79a93bf1{HTTP/1.1,[http/1.1]}{0.0.0.0:33477} (org.eclipse.jetty.server.AbstractConnector:292)
[2019-03-15 09:23:44,474] INFO Started @16219ms (org.eclipse.jetty.server.Server:407)
[2019-03-15 09:23:44,475] INFO REST server listening at [http://127.0.1.1:33477/] (org.apache.kafka.trogdor.rest.JsonRestServer:123)
[2019-03-15 09:23:44,484] INFO Starting REST server (org.apache.kafka.trogdor.rest.JsonRestServer:89)
[2019-03-15 09:23:44,485] INFO Registered resource org.apache.kafka.trogdor.coordinator.CoordinatorRestResource@2e06ee92 (org.apache.kafka.trogdor.rest.JsonRestServer:94)
[2019-03-15 09:23:44,486] INFO jetty-9.4.14.v20181114; built: 2018-11-14T21:20:31.478Z; git: c4550056e785fb5665914545889f21dc136ad9e6; jvm 11.0.1+13-LTS (org.eclipse.jetty.server.Server:370)
[2019-03-15 09:23:44,536] INFO DefaultSessionIdManager workerName=node0 (org.eclipse.jetty.server.session:365)
[2019-03-15

[jira] [Created] (KAFKA-8114) Flaky Test DelegationTokenEndToEndAuthorizationTest#testNoGroupAcl

2019-03-15 Thread Matthias J. Sax (JIRA)
Matthias J. Sax created KAFKA-8114:
--

 Summary: Flaky Test 
DelegationTokenEndToEndAuthorizationTest#testNoGroupAcl
 Key: KAFKA-8114
 URL: https://issues.apache.org/jira/browse/KAFKA-8114
 Project: Kafka
  Issue Type: Bug
  Components: core, unit tests
Affects Versions: 2.3.0
Reporter: Matthias J. Sax
 Fix For: 2.3.0


[https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/3254/testReport/junit/kafka.api/DelegationTokenEndToEndAuthorizationTest/testNoGroupAcl/]
{quote}java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.SaslAuthenticationException: Authentication failed during authentication due to invalid credentials with SASL mechanism SCRAM-SHA-256
at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
at kafka.api.DelegationTokenEndToEndAuthorizationTest.createDelegationToken(DelegationTokenEndToEndAuthorizationTest.scala:88)
at kafka.api.DelegationTokenEndToEndAuthorizationTest.configureSecurityAfterServersStart(DelegationTokenEndToEndAuthorizationTest.scala:63)
at kafka.integration.KafkaServerTestHarness.setUp(KafkaServerTestHarness.scala:107)
at kafka.api.IntegrationTestHarness.doSetup(IntegrationTestHarness.scala:81)
at kafka.api.IntegrationTestHarness.setUp(IntegrationTestHarness.scala:73)
at kafka.api.EndToEndAuthorizationTest.setUp(EndToEndAuthorizationTest.scala:183)
at kafka.api.DelegationTokenEndToEndAuthorizationTest.setUp(DelegationTokenEndToEndAuthorizationTest.scala:74){quote}
STDOUT
{quote}Adding ACLs for resource `Cluster:LITERAL:kafka-cluster`:
 User:scram-admin has Allow permission for operations: ClusterAction from hosts: *
Current ACLs for resource `Cluster:LITERAL:kafka-cluster`:
 User:scram-admin has Allow permission for operations: ClusterAction from hosts: *
Adding ACLs for resource `Topic:LITERAL:*`:
 User:scram-admin has Allow permission for operations: Read from hosts: *
Current ACLs for resource `Topic:LITERAL:*`:
 User:scram-admin has Allow permission for operations: Read from hosts: *
Completed Updating config for entity: user-principal 'scram-admin'.
Completed Updating config for entity: user-principal 'scram-user'.
Adding ACLs for resource `Topic:LITERAL:e2etopic`:
 User:scram-user has Allow permission for operations: Write from hosts: *
 User:scram-user has Allow permission for operations: Create from hosts: *
 User:scram-user has Allow permission for operations: Describe from hosts: *
Current ACLs for resource `Topic:LITERAL:e2etopic`:
 User:scram-user has Allow permission for operations: Write from hosts: *
 User:scram-user has Allow permission for operations: Create from hosts: *
 User:scram-user has Allow permission for operations: Describe from hosts: *
Adding ACLs for resource `Group:LITERAL:group`:
 User:scram-user has Allow permission for operations: Read from hosts: *
Current ACLs for resource `Group:LITERAL:group`:
 User:scram-user has Allow permission for operations: Read from hosts: *
Current ACLs for resource `Topic:LITERAL:e2etopic`:
 User:scram-user has Allow permission for operations: Write from hosts: *
 User:scram-user has Allow permission for operations: Create from hosts: *
Current ACLs for resource `Topic:LITERAL:e2etopic`:
 User:scram-user has Allow permission for operations: Create from hosts: *
[2019-03-15 09:58:16,481] ERROR [Consumer clientId=consumer-99, groupId=group] Topic authorization failed for topics [e2etopic] (org.apache.kafka.clients.Metadata:297)
[2019-03-15 09:58:17,527] WARN Unable to read additional data from client sessionid 0x104549c2b88000a, likely client has closed socket (org.apache.zookeeper.server.NIOServerCnxn:376)
Adding ACLs for resource `Cluster:LITERAL:kafka-cluster`:
 User:scram-admin has Allow permission for operations: ClusterAction from hosts: *
Current ACLs for resource `Cluster:LITERAL:kafka-cluster`:
 User:scram-admin has Allow permission for operations: ClusterAction from hosts: *
Adding ACLs for resource `Topic:LITERAL:*`:
 User:scram-admin has Allow permission for operations: Read from hosts: *
Current ACLs for resource `Topic:LITERAL:*`:
 User:scram-admin has Allow permission for operations: Read from hosts: *
Completed Updating config for entity: user-principal 'scram-admin'.
Completed Updating config for entity: user-principal 'scram-user'.
Adding ACLs for resource `Topic:PREFIXED:e2e`:
 User:scram-user has Allow permission for operations: Read from hosts: *
 User:scram-user has Allow permission for operations: Describe from hosts: *
 User:scram-user has Allow permission for operations: Write from hosts: *
[jira] [Commented] (KAFKA-8027) Gradual decline in performance of CachingWindowStore provider when number of keys grow

2019-03-15 Thread Sophie Blee-Goldman (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793994#comment-16793994
 ] 

Sophie Blee-Goldman commented on KAFKA-8027:


Hi [~prashantideal], I have been looking into this and have two PRs aimed at 
improving performance of segmented stores with caching enabled. Would you be 
able to test either or both of them out, and let me know if they improve things 
at all? You can find the first PR 
[here|https://github.com/apache/kafka/pull/6433] and the second one 
[here|https://github.com/apache/kafka/pull/6448].

Keep in mind these are just improvements to the caching layer and are unlikely 
to result in overall better performance than withCachingDisabled, since as you 
point out for range queries we must search the underlying RocksDBStore anyway. 
If you don't need caching for other reasons (eg reducing downstream traffic) 
and can afford to turn it off, I recommend doing so. 

> Gradual decline in performance of CachingWindowStore provider when number of 
> keys grow
> --
>
> Key: KAFKA-8027
> URL: https://issues.apache.org/jira/browse/KAFKA-8027
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.1.0
>Reporter: Prashant
>Priority: Major
>  Labels: interactivequ, kafka-streams
>
> We observed this during a performance test of our stream application, which 
> tracks user activity and provides a REST interface to query the window state 
> store. We used the default configuration of Materialized, i.e. 
> withCachingEnabled, for storing user behaviour stats in a window state store 
> (CompositeWindowStore with CachingWindowStore as the underlying store, which 
> internally uses RocksDBStore for persistence).
> While querying the window store with store.fetch(key, long, long), it 
> internally tries to fetch the range from ThreadCache, which uses a byte 
> iterator to search for a key in the cache and, on a cache miss, goes to 
> RocksDBStore for the result. So when the number of keys in the cache becomes 
> large, this ThreadCache search starts taking time (a range iterator over all 
> keys), which impacts WindowStore query performance.
>  
> Workaround: If we disable the cache with the switch on the Materialized 
> instance, i.e. withCachingDisabled, the key search is delegated directly to 
> RocksDBStore, which is much faster and completes the search in microseconds, 
> versus milliseconds in the case of CachingWindowStore.
>  
> Stats: With unique users > 0.5M, random search for a key, i.e. UserId:
>  
> withCachingEnabled: 40ms < t < 80ms (upper bound increases as unique users 
> grow)
> withCachingDisabled: t < 1ms (almost constant time)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8030) Flaky Test TopicCommandWithAdminClientTest#testDescribeUnderMinIsrPartitionsMixed

2019-03-15 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793972#comment-16793972
 ] 

Matthias J. Sax commented on KAFKA-8030:


Failed again: 
[https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk11/detail/kafka-trunk-jdk11/376/tests]

> Flaky Test 
> TopicCommandWithAdminClientTest#testDescribeUnderMinIsrPartitionsMixed
> -
>
> Key: KAFKA-8030
> URL: https://issues.apache.org/jira/browse/KAFKA-8030
> Project: Kafka
>  Issue Type: Bug
>  Components: admin, unit tests
>Affects Versions: 2.3.0
>Reporter: Matthias J. Sax
>Assignee: Viktor Somogyi-Vass
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.3.0
>
>
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/2830/testReport/junit/kafka.admin/TopicCommandWithAdminClientTest/testDescribeUnderMinIsrPartitionsMixed/]
> {quote}java.lang.AssertionError at org.junit.Assert.fail(Assert.java:87) at 
> org.junit.Assert.assertTrue(Assert.java:42) at 
> org.junit.Assert.assertTrue(Assert.java:53) at 
> kafka.admin.TopicCommandWithAdminClientTest.testDescribeUnderMinIsrPartitionsMixed(TopicCommandWithAdminClientTest.scala:602){quote}
> STDERR
> {quote}Option "[replica-assignment]" can't be used with option 
> "[partitions]"{quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8091) Flaky test DynamicBrokerReconfigurationTest#testAddRemoveSaslListener

2019-03-15 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793980#comment-16793980
 ] 

Matthias J. Sax commented on KAFKA-8091:


Ack. Thanks!

> Flaky test  DynamicBrokerReconfigurationTest#testAddRemoveSaslListener 
> ---
>
> Key: KAFKA-8091
> URL: https://issues.apache.org/jira/browse/KAFKA-8091
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.2.0
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>Priority: Critical
> Fix For: 2.3.0, 2.2.1
>
>
> See KAFKA-6824 for details. Since the SSL version of the test is currently 
> skipped using @Ignore, fixing this for SASL first and wait for that to be 
> stable before re-enabling SSL tests under KAFKA-6824. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7965) Flaky Test ConsumerBounceTest#testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup

2019-03-15 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793973#comment-16793973
 ] 

Matthias J. Sax commented on KAFKA-7965:


Failed again with "Should have received an class 
org.apache.kafka.common.errors.GroupMaxSizeReachedException during the cluster 
roll": 
[https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3469/tests]

 

> Flaky Test 
> ConsumerBounceTest#testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup
> 
>
> Key: KAFKA-7965
> URL: https://issues.apache.org/jira/browse/KAFKA-7965
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer, unit tests
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Matthias J. Sax
>Assignee: Stanislav Kozlovski
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.3.0, 2.2.1
>
>
> To get stable nightly builds for `2.2` release, I create tickets for all 
> observed test failures.
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/21/]
> {quote}java.lang.AssertionError: Received 0, expected at least 68
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at kafka.api.ConsumerBounceTest.receiveAndCommit(ConsumerBounceTest.scala:557)
> at kafka.api.ConsumerBounceTest.$anonfun$testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup$1(ConsumerBounceTest.scala:320)
> at kafka.api.ConsumerBounceTest.$anonfun$testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup$1$adapted(ConsumerBounceTest.scala:319)
> at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
> at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
> at kafka.api.ConsumerBounceTest.testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup(ConsumerBounceTest.scala:319){quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8113) Flaky Test ListOffsetsRequestTest#testResponseIncludesLeaderEpoch

2019-03-15 Thread Matthias J. Sax (JIRA)
Matthias J. Sax created KAFKA-8113:
--

 Summary: Flaky Test 
ListOffsetsRequestTest#testResponseIncludesLeaderEpoch
 Key: KAFKA-8113
 URL: https://issues.apache.org/jira/browse/KAFKA-8113
 Project: Kafka
  Issue Type: Bug
  Components: core, unit tests
Affects Versions: 2.3.0
Reporter: Matthias J. Sax
 Fix For: 2.3.0


[https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3468/tests]
{quote}java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:87)
at org.junit.Assert.assertTrue(Assert.java:42)
at org.junit.Assert.assertTrue(Assert.java:53)
at 
kafka.server.ListOffsetsRequestTest.fetchOffsetAndEpoch$1(ListOffsetsRequestTest.scala:136)
at 
kafka.server.ListOffsetsRequestTest.testResponseIncludesLeaderEpoch(ListOffsetsRequestTest.scala:151){quote}
STDOUT
{quote}[2019-03-15 17:16:13,029] ERROR [ReplicaFetcher replicaId=2, leaderId=1, 
fetcherId=0] Error for partition topic-0 at offset 0 
(kafka.server.ReplicaFetcherThread:76)
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
does not host this topic-partition.
[2019-03-15 17:16:13,231] ERROR [KafkaApi-0] Error while responding to offset 
request (kafka.server.KafkaApis:76)
org.apache.kafka.common.errors.ReplicaNotAvailableException: Partition topic-0 
is not available{quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8112) Add system test to detect compatibility issues when requests are updated

2019-03-15 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-8112:
-

 Summary: Add system test to detect compatibility issues when 
requests are updated
 Key: KAFKA-8112
 URL: https://issues.apache.org/jira/browse/KAFKA-8112
 Project: Kafka
  Issue Type: Test
  Components: system tests
Reporter: Rajini Sivaram


Both compatibility_test_new_broker_test.py and upgrade_test.py passed with the 
Metadata version issue in KAFKA-8111. We didn't have a full system test build 
after the changes, so we are not sure whether other tests may have failed. This 
ticket is to make sure that we add a test that would fail for similar 
compatibility issues in the future.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8111) KafkaProducer can't produce data

2019-03-15 Thread John Roesler (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler updated KAFKA-8111:

Priority: Critical  (was: Major)

> KafkaProducer can't produce data
> 
>
> Key: KAFKA-8111
> URL: https://issues.apache.org/jira/browse/KAFKA-8111
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, core
>Affects Versions: 2.3.0
>Reporter: John Roesler
>Assignee: Rajini Sivaram
>Priority: Critical
>
> Using a Producer from the current trunk (a6691fb79), I'm unable to produce 
> data to a 2.2 broker.
> tl;dr, I narrowed down the problem to 
> [https://github.com/apache/kafka/commit/a42f16f98] . My hypothesis is that 
> some part of that commit broke backward compatibility with older brokers.
>  
> Repro steps:
> I'm using this Producer config:
> {noformat}
> final Properties properties = new Properties();
> properties.setProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, BROKER);
> properties.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, 
> StringSerializer.class.getCanonicalName());
> properties.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, 
> StringSerializer.class.getCanonicalName());
> return properties;{noformat}
>  # create a simple Producer to produce test data to a broker
>  # build against commit a42f16f98 
>  # start an older broker. (I was using 2.1, and someone else reproduced it 
> with 2.2)
>  # run your producer and note that it doesn't produce data (seems to hang, I 
> see it produce 2 records in 1 minute)
>  # build against the predecessor commit 65aea1f36
>  # run your producer and note that it DOES produce data (I see it produce 1M 
> records every 15 seconds)
> I've also confirmed that if I check out the current trunk (a6691fb79e2c55b3) 
> and revert a42f16f98, I also observe that it produces as expected (1M every 
> 15 seconds).
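A minimal driver matching the repro steps above, shown for convenience (broker address, topic name, and record count are hypothetical placeholders):

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProduceRepro {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // hypothetical broker
        props.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                StringSerializer.class.getCanonicalName());
        props.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                StringSerializer.class.getCanonicalName());
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (long i = 0; i < 1_000_000; i++) {
                producer.send(new ProducerRecord<>("test-topic", Long.toString(i), "value-" + i));
            }
        } // close() flushes any outstanding records
    }
}
{code}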



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8111) KafkaProducer can't produce data

2019-03-15 Thread John Roesler (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler updated KAFKA-8111:

Labels: blocker  (was: )

> KafkaProducer can't produce data
> 
>
> Key: KAFKA-8111
> URL: https://issues.apache.org/jira/browse/KAFKA-8111
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, core
>Affects Versions: 2.3.0
>Reporter: John Roesler
>Assignee: Rajini Sivaram
>Priority: Critical
>  Labels: blocker
>
> Using a Producer from the current trunk (a6691fb79), I'm unable to produce 
> data to a 2.2 broker.
> tl;dr, I narrowed down the problem to 
> [https://github.com/apache/kafka/commit/a42f16f98] . My hypothesis is that 
> some part of that commit broke backward compatibility with older brokers.
>  
> Repro steps:
> I'm using this Producer config:
> {noformat}
> final Properties properties = new Properties();
> properties.setProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, BROKER);
> properties.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, 
> StringSerializer.class.getCanonicalName());
> properties.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, 
> StringSerializer.class.getCanonicalName());
> return properties;{noformat}
>  # create a simple Producer to produce test data to a broker
>  # build against commit a42f16f98 
>  # start an older broker. (I was using 2.1, and someone else reproduced it 
> with 2.2)
>  # run your producer and note that it doesn't produce data (seems to hang, I 
> see it produce 2 records in 1 minute)
>  # build against the predecessor commit 65aea1f36
>  # run your producer and note that it DOES produce data (I see it produce 1M 
> records every 15 seconds)
> I've also confirmed that if I check out the current trunk (a6691fb79e2c55b3) 
> and revert a42f16f98, I also observe that it produces as expected (1M every 
> 15 seconds).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-3083) a soft failure in controller may leave a topic partition in an inconsistent state

2019-03-15 Thread Shannon Carey (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-3083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793855#comment-16793855
 ] 

Shannon Carey commented on KAFKA-3083:
--

Is there a way to make this less likely to occur in versions before the fix? 
Would using a larger value for zookeeper.session.timeout.ms make any 
difference? I assume that "broker A's session expires" refers to the broker's 
Zookeeper session?

> a soft failure in controller may leave a topic partition in an inconsistent 
> state
> -
>
> Key: KAFKA-3083
> URL: https://issues.apache.org/jira/browse/KAFKA-3083
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Onur Karaman
>Priority: Major
>  Labels: reliability
> Fix For: 1.1.0
>
>
> The following sequence can happen.
> 1. Broker A is the controller and is in the middle of processing a broker 
> change event. As part of this process, let's say it's about to shrink the isr 
> of a partition.
> 2. Then broker A's session expires and broker B takes over as the new 
> controller. Broker B sends the initial leaderAndIsr request to all brokers.
> 3. Broker A continues by shrinking the isr of the partition in ZK and sends 
> the new leaderAndIsr request to the broker (say C) that leads the partition. 
> Broker C will reject this leaderAndIsr since the request comes from a 
> controller with an older epoch. Now we could be in a situation that Broker C 
> thinks the isr has all replicas, but the isr stored in ZK is different.
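The rejection in step 3 relies on controller-epoch fencing: brokers remember the highest controller epoch they have seen and drop requests carrying anything older. A simplified sketch of the idea (class, field, and method names are hypothetical, not the broker's actual code):

{code:java}
// Illustrative controller-epoch fence, as used conceptually in step 3 above.
final class ControllerEpochFence {
    private int highestSeenControllerEpoch = 0;

    /** Returns true if the request should be processed, false if it is stale. */
    synchronized boolean accept(int requestControllerEpoch) {
        if (requestControllerEpoch < highestSeenControllerEpoch) {
            // Broker C's rejection: the leaderAndIsr request carries an older
            // epoch than the one broker B (the new controller) already sent.
            return false;
        }
        highestSeenControllerEpoch = requestControllerEpoch;
        return true;
    }
}
{code}

Note that this fence only protects the request path; as the sequence shows, broker A can still write the shrunken isr to ZK in step 3, which is how the inconsistency arises.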



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8091) Flaky test DynamicBrokerReconfigurationTest#testAddRemoveSaslListener

2019-03-15 Thread Rajini Sivaram (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793849#comment-16793849
 ] 

Rajini Sivaram commented on KAFKA-8091:
---

[~mjsax] Merged another fix today for that issue (after that failed run). Will 
wait and see if that has fixed the issue.

> Flaky test  DynamicBrokerReconfigurationTest#testAddRemoveSaslListener 
> ---
>
> Key: KAFKA-8091
> URL: https://issues.apache.org/jira/browse/KAFKA-8091
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.2.0
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>Priority: Critical
> Fix For: 2.3.0, 2.2.1
>
>
> See KAFKA-6824 for details. Since the SSL version of the test is currently 
> skipped using @Ignore, fixing this for SASL first and wait for that to be 
> stable before re-enabling SSL tests under KAFKA-6824. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8106) Remove unnecessary decompression operation when logValidator do validation.

2019-03-15 Thread Flower.min (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flower.min updated KAFKA-8106:
--
Description: 
       We did performance testing of Kafka in the specific scenario described 
below. We built a Kafka cluster with one broker and created topics with 
different numbers of partitions; then we started many producer processes to 
send large amounts of messages to one of the topics in each test.

*_Specific scenario_*
 # *_Server:_* CPU: 2*16; MemTotal: 256G; Ethernet controller: Intel 
Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection; SSD.
 # *_Topics:_* topic1: 50 partitions, topic2: 100 partitions, topic3: 200 
partitions, ..., 2000 partitions
 # *_Size of a single message:_* 1024B

*_Config of KafkaProducer:_*
 # *_compression.type:_* lz4
 # *_linger.ms:_* 1000ms/2000ms/5000ms
 # *_batch.size:_* 16384B/10240B/102400B
 # *_buffer.memory:_* 134217728B

*_The best result of performance testing:_*
 # *_Performance:_* 2300 messages/s.
 # *_Resource usage:_* Network inflow rate: 550MB/s~610MB/s; CPU: 97%~99%; 
disk write speed: 550MB/s~610MB/s.

*_Phenomenon and my doubt:_*
      The upper limit of CPU usage has been reached, but the upper limit of 
the server's network bandwidth has not. We are unsure what is costing so much 
CPU time, and we want to improve performance and reduce the CPU usage of the 
Kafka server.


  was:
       We did performance testing of Kafka in the specific scenario described 
below. We built a Kafka cluster with one broker and created topics with 
different numbers of partitions; then we started many producer processes to 
send large amounts of messages to one of the topics in each test.

*_Specific scenario_*
 # *_Server:_* CPU: 2*16; MemTotal: 256G; Ethernet controller: Intel 
Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection; SSD.
 # *_Topics:_* topic1: 50 partitions, topic2: 100 partitions, topic3: 200 
partitions, ..., 2000 partitions
 # *_Size of a single message:_* 1024B

*_Config of KafkaProducer:_*
 # *_compression.type:_* lz4
 # *_linger.ms:_* 1000ms/2000ms/5000ms
 # *_batch.size:_* 16384B/10240B/102400B
 # *_buffer.memory:_* 134217728B

*_The best result of performance testing:_*
 # *_Performance:_* 2300 messages/s.
 # *_Resource usage:_* Network inflow rate: 550MB/s~610MB/s; CPU: 97%~99%; 
disk write speed: 550MB/s~610MB/s.

*_Phenomenon and my doubt:_*
      The upper limit of CPU usage has been reached, but the upper limit of 
the server's network bandwidth has not. We are unsure what is costing so much 
CPU time, and we want to improve performance and reduce the CPU usage of the 
Kafka server.


> Remove unnecessary decompression operation when logValidator  do validation.
> 
>
> Key: KAFKA-8106
> URL: https://issues.apache.org/jira/browse/KAFKA-8106
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, core
>Affects Versions: 2.1.1
>Reporter: Flower.min
>Priority: Major
>
>        We did performance testing of Kafka in the specific scenario described 
> below. We built a Kafka cluster with one broker and created topics with 
> different numbers of partitions; then we started many producer processes to 
> send large amounts of messages to one of the topics in each test.
> *_Specific scenario_*
>  # *_Server:_* CPU: 2*16; MemTotal: 256G; Ethernet controller: Intel 
> Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection; SSD.
>  # *_Topics:_* topic1: 50 partitions, topic2: 100 partitions, topic3: 200 
> partitions, ..., 2000 partitions
>  # *_Size of a single message:_* 1024B
> *_Config of KafkaProducer:_*
>  # *_compression.type:_* lz4
>  # *_linger.ms:_* 1000ms/2000ms/5000ms
>  # *_batch.size:_* 16384B/10240B/102400B
>  # *_buffer.memory:_* 134217728B
> *_The best result of performance testing:_*
>  # *_Performance:_* 2300 messages/s.
>  # *_Resource usage:_* Network inflow rate: 550MB/s~610MB/s; CPU: 97%~99%; 
> disk write speed: 550MB/s~610MB/s.
> *_Phenomenon and my doubt:_*
>       The upper limit of CPU usage has been reached, but the upper limit of 
> the server's network bandwidth has not. We are unsure what is costing so much 
> CPU time, and we want to improve performance and reduce the CPU usage of the 
> Kafka server.
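Spelled out as code, the producer configuration from the scenario above looks roughly like this (the broker address is a hypothetical placeholder; one of the tested linger.ms/batch.size values is shown):

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.ByteArraySerializer;

public class LoadTestProducer {
    public static KafkaProducer<byte[], byte[]> create() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092"); // hypothetical
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
        props.put(ProducerConfig.LINGER_MS_CONFIG, 1000);           // tested: 1000/2000/5000 ms
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384);         // tested: 16384/10240/102400 B
        props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 134217728L); // 128 MB
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                ByteArraySerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                ByteArraySerializer.class.getName());
        return new KafkaProducer<>(props);
    }
}
{code}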



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8106) Remove unnecessary decompression operation when logValidator do validation.

2019-03-15 Thread Flower.min (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flower.min updated KAFKA-8106:
--
Description: 
       We did performance testing of Kafka in the specific scenario described 
below. We built a Kafka cluster with one broker and created topics with 
different numbers of partitions; then we started many producer processes to 
send large amounts of messages to one of the topics in each test.

*_Specific scenario_*
 # *_Server:_* CPU: 2*16; MemTotal: 256G; Ethernet controller: Intel 
Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection; SSD.
 # *_Topics:_* topic1: 50 partitions, topic2: 100 partitions, topic3: 200 
partitions, ..., 2000 partitions
 # *_Size of a single message:_* 1024B

*_Config of KafkaProducer:_*
 # *_compression.type:_* lz4
 # *_linger.ms:_* 1000ms/2000ms/5000ms
 # *_batch.size:_* 16384B/10240B/102400B
 # *_buffer.memory:_* 134217728B

*_The best result of performance testing:_*
 # *_Performance:_* 2300 messages/s.
 # *_Resource usage:_* Network inflow rate: 550MB/s~610MB/s; CPU: 97%~99%; 
disk write speed: 550MB/s~610MB/s.

*_Phenomenon and my doubt:_*
      The upper limit of CPU usage has been reached, but the upper limit of 
the server's network bandwidth has not. We are unsure what is costing so much 
CPU time, and we want to improve performance and reduce the CPU usage of the 
Kafka server.


> Remove unnecessary decompression operation when logValidator  do validation.
> 
>
> Key: KAFKA-8106
> URL: https://issues.apache.org/jira/browse/KAFKA-8106
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, core
>Affects Versions: 2.1.1
>Reporter: Flower.min
>Priority: Major
>
>        We did performance testing of Kafka in the specific scenario described 
> below. We built a Kafka cluster with one broker and created topics with 
> different numbers of partitions; then we started many producer processes to 
> send large amounts of messages to one of the topics in each test.
> *_Specific scenario_*
>  # *_Server:_* CPU: 2*16; MemTotal: 256G; Ethernet controller: Intel 
> Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection; SSD.
>  # *_Topics:_* topic1: 50 partitions, topic2: 100 partitions, topic3: 200 
> partitions, ..., 2000 partitions
>  # *_Size of a single message:_* 1024B
> *_Config of KafkaProducer:_*
>  # *_compression.type:_* lz4
>  # *_linger.ms:_* 1000ms/2000ms/5000ms
>  # *_batch.size:_* 16384B/10240B/102400B
>  # *_buffer.memory:_* 134217728B
> *_The best result of performance testing:_*
>  # *_Performance:_* 2300 messages/s.
>  # *_Resource usage:_* Network inflow rate: 550MB/s~610MB/s; CPU: 97%~99%; 
> disk write speed: 550MB/s~610MB/s.
> *_Phenomenon and my doubt:_*
>       The upper limit of CPU usage has been reached, but the upper limit of 
> the server's network bandwidth has not. We are unsure what is costing so much 
> CPU time, and we want to improve performance and reduce the CPU usage of the 
> Kafka server.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8106) Remove

2019-03-15 Thread Flower.min (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flower.min updated KAFKA-8106:
--
Summary: Remove   (was: Remove unnecessary decompression operation when 
logValidator  do validation.)

> Remove 
> ---
>
> Key: KAFKA-8106
> URL: https://issues.apache.org/jira/browse/KAFKA-8106
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, core
>Affects Versions: 2.1.1
>Reporter: Flower.min
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8111) KafkaProducer can't produce data

2019-03-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793837#comment-16793837
 ] 

ASF GitHub Bot commented on KAFKA-8111:
---

rajinisivaram commented on pull request #6451: KAFKA-8111; Set min and max 
versions for Metadata requests
URL: https://github.com/apache/kafka/pull/6451
 
 
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> KafkaProducer can't produce data
> 
>
> Key: KAFKA-8111
> URL: https://issues.apache.org/jira/browse/KAFKA-8111
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, core
>Affects Versions: 2.3.0
>Reporter: John Roesler
>Assignee: Rajini Sivaram
>Priority: Major
>
> Using a Producer from the current trunk (a6691fb79), I'm unable to produce 
> data to a 2.2 broker.
> tl;dr, I narrowed down the problem to 
> [https://github.com/apache/kafka/commit/a42f16f98] . My hypothesis is that 
> some part of that commit broke backward compatibility with older brokers.
>  
> Repro steps:
> I'm using this Producer config:
> {noformat}
> final Properties properties = new Properties();
> properties.setProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, BROKER);
> properties.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, 
> StringSerializer.class.getCanonicalName());
> properties.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, 
> StringSerializer.class.getCanonicalName());
> return properties;{noformat}
>  # create a simple Producer to produce test data to a broker
>  # build against commit a42f16f98 
>  # start an older broker. (I was using 2.1, and someone else reproduced it 
> with 2.2)
>  # run your producer and note that it doesn't produce data (seems to hang, I 
> see it produce 2 records in 1 minute)
>  # build against the predecessor commit 65aea1f36
>  # run your producer and note that it DOES produce data (I see it produce 1M 
> records every 15 seconds)
> I've also confirmed that if I check out the current trunk (a6691fb79e2c55b3) 
> and revert a42f16f98, I also observe that it produces as expected (1M every 
> 15 seconds).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8106) Remove unnecessary decompression operation when logValidator do validation.

2019-03-15 Thread Flower.min (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flower.min updated KAFKA-8106:
--
Description: (was: 
      We did performance testing of Kafka in the specific scenario described 
below. We built a Kafka cluster with one broker and created topics with 
different numbers of partitions; then we started many producer processes to 
send large amounts of messages to one of the topics in each test.

*_Specific scenario_*
 # *_Server:_* CPU: 2*16; MemTotal: 256G; Ethernet controller: Intel 
Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection; SSD.
 # *_Topics:_* topic1: 50 partitions, topic2: 100 partitions, topic3: 200 
partitions, ..., 2000 partitions
 # *_Size of a single message:_* 1024B

*_Config of KafkaProducer:_*
 # *_compression.type:_* lz4
 # *_linger.ms:_* 1000ms/2000ms/5000ms
 # *_batch.size:_* 16384B/10240B/102400B
 # *_buffer.memory:_* 134217728B

*_The best result of performance testing:_*
 # *_Performance:_* 2300 messages/s.
 # *_Resource usage:_* Network inflow rate: 550MB/s~610MB/s; CPU: 97%~99%; 
disk write speed: 550MB/s~610MB/s.

*_Phenomenon and my doubt:_*
      The upper limit of CPU usage has been reached, but the upper limit of 
the server's network bandwidth has not. We are unsure what is costing so much 
CPU time, and we want to improve performance and reduce the CPU usage of the 
Kafka server.)

> Remove unnecessary decompression operation when logValidator  do validation.
> 
>
> Key: KAFKA-8106
> URL: https://issues.apache.org/jira/browse/KAFKA-8106
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, core
>Affects Versions: 2.1.1
>Reporter: Flower.min
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KAFKA-8111) KafkaProducer can't produce data

2019-03-15 Thread Rajini Sivaram (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram reassigned KAFKA-8111:
-

Assignee: Rajini Sivaram

> KafkaProducer can't produce data
> 
>
> Key: KAFKA-8111
> URL: https://issues.apache.org/jira/browse/KAFKA-8111
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, core
>Affects Versions: 2.3.0
>Reporter: John Roesler
>Assignee: Rajini Sivaram
>Priority: Major
>
> Using a Producer from the current trunk (a6691fb79), I'm unable to produce 
> data to a 2.2 broker.
> tl;dr, I narrowed down the problem to 
> [https://github.com/apache/kafka/commit/a42f16f98] . My hypothesis is that 
> some part of that commit broke backward compatibility with older brokers.
>  
> Repro steps:
> I'm using this Producer config:
> {noformat}
> final Properties properties = new Properties();
> properties.setProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, BROKER);
> properties.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, 
> StringSerializer.class.getCanonicalName());
> properties.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, 
> StringSerializer.class.getCanonicalName());
> return properties;{noformat}
>  # create a simple Producer to produce test data to a broker
>  # build against commit a42f16f98 
>  # start an older broker. (I was using 2.1, and someone else reproduced it 
> with 2.2)
>  # run your producer and note that it doesn't produce data (seems to hang, I 
> see it produce 2 records in 1 minute)
>  # build against the predecessor commit 65aea1f36
>  # run your producer and note that it DOES produce data (I see it produce 1M 
> records every 15 seconds)
> I've also confirmed that if I check out the current trunk (a6691fb79e2c55b3) 
> and revert a42f16f98, I also observe that it produces as expected (1M every 
> 15 seconds).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7965) Flaky Test ConsumerBounceTest#testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup

2019-03-15 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793781#comment-16793781
 ] 

Matthias J. Sax commented on KAFKA-7965:


"Boxed error" here 
[https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk11/detail/kafka-trunk-jdk11/373/tests]

 

> Flaky Test 
> ConsumerBounceTest#testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup
> 
>
> Key: KAFKA-7965
> URL: https://issues.apache.org/jira/browse/KAFKA-7965
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer, unit tests
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Matthias J. Sax
>Assignee: Stanislav Kozlovski
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.3.0, 2.2.1
>
>
> To get stable nightly builds for `2.2` release, I create tickets for all 
> observed test failures.
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/21/]
> {quote}java.lang.AssertionError: Received 0, expected at least 68
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at kafka.api.ConsumerBounceTest.receiveAndCommit(ConsumerBounceTest.scala:557)
> at kafka.api.ConsumerBounceTest.$anonfun$testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup$1(ConsumerBounceTest.scala:320)
> at kafka.api.ConsumerBounceTest.$anonfun$testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup$1$adapted(ConsumerBounceTest.scala:319)
> at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
> at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
> at kafka.api.ConsumerBounceTest.testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup(ConsumerBounceTest.scala:319){quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8091) Flaky test DynamicBrokerReconfigurationTest#testAddRemoveSaslListener

2019-03-15 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16793780#comment-16793780
 ] 

Matthias J. Sax commented on KAFKA-8091:


[~rsivaram] this test failed again: 
[https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3466/tests]
{quote}org.scalatest.junit.JUnitTestFailedError: Operation should not have 
completed
at 
org.scalatest.junit.AssertionsForJUnit.newAssertionFailedException(AssertionsForJUnit.scala:100)
at 
org.scalatest.junit.AssertionsForJUnit.newAssertionFailedException$(AssertionsForJUnit.scala:99)
at 
org.scalatest.junit.JUnitSuite.newAssertionFailedException(JUnitSuite.scala:71)
at org.scalatest.Assertions.fail(Assertions.scala:1089)
at org.scalatest.Assertions.fail$(Assertions.scala:1085)
at org.scalatest.junit.JUnitSuite.fail(JUnitSuite.scala:71)
at 
kafka.server.DynamicBrokerReconfigurationTest.verifyTimeout(DynamicBrokerReconfigurationTest.scala:1328)
at 
kafka.server.DynamicBrokerReconfigurationTest.verifyRemoveListener(DynamicBrokerReconfigurationTest.scala:981)
at 
kafka.server.DynamicBrokerReconfigurationTest.testAddRemoveSaslListeners(DynamicBrokerReconfigurationTest.scala:843){quote}
STDOUT
{quote}Completed Updating config for entity: brokers '0'.
Completed Updating config for entity: brokers '1'.
Completed Updating config for entity: brokers '2'.
[2019-03-15 05:51:46,395] ERROR [ReplicaFetcher replicaId=1, leaderId=2, 
fetcherId=1] Error for partition testtopic-6 at offset 0 
(kafka.server.ReplicaFetcherThread:76)
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
does not host this topic-partition.
[2019-03-15 05:51:46,395] ERROR [ReplicaFetcher replicaId=1, leaderId=2, 
fetcherId=1] Error for partition testtopic-0 at offset 0 
(kafka.server.ReplicaFetcherThread:76)
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
does not host this topic-partition.
Completed Updating config for entity: brokers '0'.
Completed Updating config for entity: brokers '1'.
Completed Updating config for entity: brokers '2'.
[2019-03-15 05:51:54,754] ERROR [ReplicaFetcher replicaId=2, leaderId=0, 
fetcherId=1] Error for partition testtopic-4 at offset 0 
(kafka.server.ReplicaFetcherThread:76)
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
does not host this topic-partition.
Completed Updating config for entity: brokers '0'.
Completed Updating config for entity: brokers '1'.
Completed Updating config for entity: brokers '2'.
[2019-03-15 05:52:06,197] WARN Unable to reconnect to ZooKeeper service, 
session 0x10453e5652b0002 has expired (org.apache.zookeeper.ClientCnxn:1289)
Completed Updating config for entity: brokers '0'.
Completed Updating config for entity: brokers '1'.
Completed Updating config for entity: brokers '2'.
[2019-03-15 05:52:15,144] ERROR [ReplicaFetcher replicaId=0, leaderId=1, 
fetcherId=1] Error for partition testtopic-6 at offset 0 
(kafka.server.ReplicaFetcherThread:76)
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
does not host this topic-partition.
[2019-03-15 05:52:15,144] ERROR [ReplicaFetcher replicaId=0, leaderId=1, 
fetcherId=1] Error for partition testtopic-0 at offset 0 
(kafka.server.ReplicaFetcherThread:76)
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
does not host this topic-partition.
[2019-03-15 05:52:15,157] ERROR [ReplicaFetcher replicaId=1, leaderId=2, 
fetcherId=0] Error for partition testtopic-7 at offset 0 
(kafka.server.ReplicaFetcherThread:76)
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
does not host this topic-partition.
[2019-03-15 05:52:15,157] ERROR [ReplicaFetcher replicaId=1, leaderId=2, 
fetcherId=0] Error for partition testtopic-1 at offset 0 
(kafka.server.ReplicaFetcherThread:76)
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
does not host this topic-partition.
[2019-03-15 05:52:15,168] ERROR [ReplicaFetcher replicaId=1, leaderId=0, 
fetcherId=1] Error for partition testtopic-2 at offset 0 
(kafka.server.ReplicaFetcherThread:76)
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
does not host this topic-partition.
[2019-03-15 05:52:15,168] ERROR [ReplicaFetcher replicaId=1, leaderId=0, 
fetcherId=0] Error for partition testtopic-5 at offset 0 
(kafka.server.ReplicaFetcherThread:76)
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
does not host this topic-partition.
[2019-03-15 05:52:15,168] ERROR [ReplicaFetcher replicaId=1, leaderId=0, 
fetcherId=1] Error for partition testtopic-8 at offset 0 
(kafka.server.ReplicaFetcherThread:76)
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
does not host this topic-partition.
[2019-03-15 05:52:15,174] ERROR [ReplicaFetcher replicaId=1, leaderId=2, 
fetcherId=1] 

[jira] [Created] (KAFKA-8111) KafkaProducer can't produce data

2019-03-15 Thread John Roesler (JIRA)
John Roesler created KAFKA-8111:
---

 Summary: KafkaProducer can't produce data
 Key: KAFKA-8111
 URL: https://issues.apache.org/jira/browse/KAFKA-8111
 Project: Kafka
  Issue Type: Bug
  Components: clients, core
Affects Versions: 2.3.0
Reporter: John Roesler


Using a Producer from the current trunk (a6691fb79), I'm unable to produce data 
to a 2.2 broker.

tl;dr: I narrowed down the problem to 
[https://github.com/apache/kafka/commit/a42f16f98] . My hypothesis is that some 
part of that commit broke backward compatibility with older brokers.

 

Repro steps:

I'm using this Producer config:
{noformat}
final Properties properties = new Properties();
properties.setProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, BROKER);
properties.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, 
StringSerializer.class.getCanonicalName());
properties.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, 
StringSerializer.class.getCanonicalName());
return properties;{noformat}
 # create a simple Producer to produce test data to a broker
 # build against commit a42f16f98 
 # start an older broker. (I was using 2.1, and someone else reproduced it with 
2.2)
 # run your producer and note that it doesn't produce data (seems to hang, I 
see it produce 2 records in 1 minute)
 # build against the predecessor commit 65aea1f36
 # run your producer and note that it DOES produce data (I see it produce 1M 
records every 15 seconds)

I've also confirmed that if I check out the current trunk (a6691fb79e2c55b3) 
and revert a42f16f98, I also observe that it produces as expected (1M every 15 
seconds).
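
For reference, a minimal self-contained harness along the lines of the repro 
steps above might look like the following (the topic name, record shape, and 
broker address are assumptions, not details from the report):
{code:java}
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProducerRepro {
    public static void main(String[] args) {
        // Same config as in the description above.
        final Properties properties = new Properties();
        properties.setProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        properties.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                StringSerializer.class.getCanonicalName());
        properties.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                StringSerializer.class.getCanonicalName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(properties)) {
            long start = System.currentTimeMillis();
            for (int i = 0; i < 1_000_000; i++) {
                producer.send(new ProducerRecord<>("test-topic", "key-" + i, "value-" + i));
            }
            producer.flush(); // blocks until all in-flight records are acknowledged
            long elapsed = System.currentTimeMillis() - start;
            // Per the report: ~15s on the predecessor commit, hangs on a42f16f98.
            System.out.println("Produced 1M records in " + elapsed + " ms");
        }
    }
}
{code}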



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8110) Flaky Test DescribeConsumerGroupTest#testDescribeMembersWithConsumersWithoutAssignedPartitions

2019-03-15 Thread Matthias J. Sax (JIRA)
Matthias J. Sax created KAFKA-8110:
--

 Summary: Flaky Test 
DescribeConsumerGroupTest#testDescribeMembersWithConsumersWithoutAssignedPartitions
 Key: KAFKA-8110
 URL: https://issues.apache.org/jira/browse/KAFKA-8110
 Project: Kafka
  Issue Type: Bug
  Components: core, unit tests
Affects Versions: 2.2.0
Reporter: Matthias J. Sax
 Fix For: 2.3.0, 2.2.1


[https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/67/testReport/junit/kafka.admin/DescribeConsumerGroupTest/testDescribeMembersWithConsumersWithoutAssignedPartitions/]
{quote}java.lang.AssertionError: Partition [__consumer_offsets,0] metadata not 
propagated after 15000 ms at kafka.utils.TestUtils$.fail(TestUtils.scala:381) 
at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:791) at 
kafka.utils.TestUtils$.waitUntilMetadataIsPropagated(TestUtils.scala:880) at 
kafka.utils.TestUtils$.$anonfun$createTopic$3(TestUtils.scala:318) at 
kafka.utils.TestUtils$.$anonfun$createTopic$3$adapted(TestUtils.scala:317) at 
scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237) at 
scala.collection.immutable.Range.foreach(Range.scala:158) at 
scala.collection.TraversableLike.map(TraversableLike.scala:237) at 
scala.collection.TraversableLike.map$(TraversableLike.scala:230) at 
scala.collection.AbstractTraversable.map(Traversable.scala:108) at 
kafka.utils.TestUtils$.createTopic(TestUtils.scala:317) at 
kafka.utils.TestUtils$.createOffsetsTopic(TestUtils.scala:375) at 
kafka.admin.DescribeConsumerGroupTest.testDescribeMembersWithConsumersWithoutAssignedPartitions(DescribeConsumerGroupTest.scala:372){quote}
STDOUT
{quote}[2019-03-14 20:01:52,347] WARN Ignoring unexpected runtime exception 
(org.apache.zookeeper.server.NIOServerCnxnFactory:236) 
java.nio.channels.CancelledKeyException at 
sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73) at 
sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87) at 
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:205)
 at java.lang.Thread.run(Thread.java:748) TOPIC PARTITION CURRENT-OFFSET 
LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID foo 0 0 0 0 - - - TOPIC PARTITION 
CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID foo 0 0 0 0 - - - 
COORDINATOR (ID) ASSIGNMENT-STRATEGY STATE #MEMBERS localhost:44669 (0){quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7898) ERROR Caught unexpected throwable (org.apache.zookeeper.ClientCnxn)

2019-03-15 Thread Steven McDonald (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16793755#comment-16793755
 ] 

Steven McDonald commented on KAFKA-7898:


{quote}I also consider it a bug that this NullPointerException leaves the Kafka 
cluster in a state that it does not recover from automatically.{quote}

For this issue, I have raised 
[ZOOKEEPER-3315|https://issues.apache.org/jira/projects/ZOOKEEPER/issues/ZOOKEEPER-3315].

> ERROR Caught unexpected throwable (org.apache.zookeeper.ClientCnxn)
> ---
>
> Key: KAFKA-7898
> URL: https://issues.apache.org/jira/browse/KAFKA-7898
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Gabriel Lukacs
>Priority: Major
>
> We observed a NullPointerException on one of our brokers in a 3-broker cluster 
> environment. If I list the processes and open ports, it seems that the faulty 
> broker is running, but Kafka Connect (which we also use) periodically 
> restarts because it cannot connect to the Kafka cluster (configured in both 
> SSL and plaintext mode). Is it a bug in Kafka/ZooKeeper?
>  
> [2019-02-05 14:28:11,359] WARN Client session timed out, have not heard from 
> server in 4141ms for sessionid 0x310166e 
> (org.apache.zookeeper.ClientCnxn)
> [2019-02-05 14:28:12,525] ERROR Caught unexpected throwable 
> (org.apache.zookeeper.ClientCnxn)
> java.lang.NullPointerException
>  at 
> kafka.zookeeper.ZooKeeperClient$$anon$8.processResult(ZooKeeperClient.scala:217)
>  at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:633)
>  at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:508)
> [2019-02-05 14:28:12,526] ERROR Caught unexpected throwable 
> (org.apache.zookeeper.ClientCnxn)
> [2019-02-05 14:28:22,701] WARN Client session timed out, have not heard from 
> server in 4004ms for sessionid 0x310166e 
> (org.apache.zookeeper.ClientCnxn)
> [2019-02-05 14:28:28,670] WARN Client session timed out, have not heard from 
> server in 4049ms for sessionid 0x310166e 
> (org.apache.zookeeper.ClientCnxn)
> [2019-02-05 15:05:20,601] WARN [GroupCoordinator 1]: Failed to write empty 
> metadata for group 
> encodable-emvTokenAccess-delta-encoder-group-emvIssuerAccess-v2-2-0: The 
> group is rebalancing, so a rejoin is needed. 
> (kafka.coordinator.group.GroupCoordinator)
> kafka 7381 1 0 14:22 ? 00:00:19 java -Xmx512M -Xms512M -server -XX:+UseG1GC 
> -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 
> -XX:+ExplicitGCInvokesConcurrent -Djava.awt.headless=true 
> -Xloggc:/opt/kafka/bin/../logs/zookeeper-gc.log -verbose:gc 
> -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps 
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M 
> -Dcom.sun.management.jmxremote 
> -Dcom.sun.management.jmxremote.authenticate=false 
> -Dcom.sun.management.jmxremote.ssl=false 
> -Dkafka.logs.dir=/opt/kafka/bin/../logs 
> -Dlog4j.configuration=file:/opt/kafka/config/zoo-log4j.properties -cp 
> 

[jira] [Comment Edited] (KAFKA-7027) Overloaded StreamsBuilder Build Method to Accept java.util.Properties

2019-03-15 Thread Bill Bejeck (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16793734#comment-16793734
 ] 

Bill Bejeck edited comment on KAFKA-7027 at 3/15/19 4:08 PM:
-

cherry-picked [https://github.com/apache/kafka/pull/6373] to 2.2 and 2.1


was (Author: bbejeck):
cherry-picked [https://github.com/apache/kafka/pull/6373] tp 2.2 and 2.1

> Overloaded StreamsBuilder Build Method to Accept java.util.Properties
> -
>
> Key: KAFKA-7027
> URL: https://issues.apache.org/jira/browse/KAFKA-7027
> Project: Kafka
>  Issue Type: New Feature
>  Components: streams
>Reporter: Bill Bejeck
>Assignee: Bill Bejeck
>Priority: Major
>  Labels: kip
> Fix For: 2.1.0
>
>
> Add overloaded method to {{StreamsBuilder}} accepting a 
> {{java.util.Properties}} instance.
>  
> KIP can be found here 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-312%3A+Add+Overloaded+StreamsBuilder+Build+Method+to+Accept+java.util.Properties
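
A usage sketch of the overload described above (topic names and config values 
here are illustrative assumptions):
{code:java}
import java.util.Properties;

import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.Topology;

public class BuildWithPropertiesExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "example-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // The Properties overload exists so the builder can see settings such
        // as topology.optimization while it constructs the Topology.
        props.put(StreamsConfig.TOPOLOGY_OPTIMIZATION, StreamsConfig.OPTIMIZE);

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input-topic").to("output-topic");

        Topology topology = builder.build(props); // overload added by this ticket
        System.out.println(topology.describe());
    }
}
{code}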



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7027) Overloaded StreamsBuilder Build Method to Accept java.util.Properties

2019-03-15 Thread Bill Bejeck (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16793734#comment-16793734
 ] 

Bill Bejeck commented on KAFKA-7027:


cherry-picked [https://github.com/apache/kafka/pull/6373] to 2.2 and 2.1

> Overloaded StreamsBuilder Build Method to Accept java.util.Properties
> -
>
> Key: KAFKA-7027
> URL: https://issues.apache.org/jira/browse/KAFKA-7027
> Project: Kafka
>  Issue Type: New Feature
>  Components: streams
>Reporter: Bill Bejeck
>Assignee: Bill Bejeck
>Priority: Major
>  Labels: kip
> Fix For: 2.1.0
>
>
> Add overloaded method to {{StreamsBuilder}} accepting a 
> {{java.util.Properties}} instance.
>  
> KIP can be found here 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-312%3A+Add+Overloaded+StreamsBuilder+Build+Method+to+Accept+java.util.Properties



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8091) Flaky test DynamicBrokerReconfigurationTest#testAddRemoveSaslListener

2019-03-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16793720#comment-16793720
 ] 

ASF GitHub Bot commented on KAFKA-8091:
---

rajinisivaram commented on pull request #6450: KAFKA-8091; Use commitSync to 
check connection failure in listener update test
URL: https://github.com/apache/kafka/pull/6450

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flaky test  DynamicBrokerReconfigurationTest#testAddRemoveSaslListener 
> ---
>
> Key: KAFKA-8091
> URL: https://issues.apache.org/jira/browse/KAFKA-8091
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.2.0
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>Priority: Critical
> Fix For: 2.3.0, 2.2.1
>
>
> See KAFKA-6824 for details. Since the SSL version of the test is currently 
> skipped using @Ignore, we are fixing this for SASL first and waiting for that to be 
> stable before re-enabling SSL tests under KAFKA-6824. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7027) Overloaded StreamsBuilder Build Method to Accept java.util.Properties

2019-03-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16793688#comment-16793688
 ] 

ASF GitHub Bot commented on KAFKA-7027:
---

bbejeck commented on pull request #6373: KAFKA-7027: Add an overload build 
method in scala
URL: https://github.com/apache/kafka/pull/6373

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Overloaded StreamsBuilder Build Method to Accept java.util.Properties
> -
>
> Key: KAFKA-7027
> URL: https://issues.apache.org/jira/browse/KAFKA-7027
> Project: Kafka
>  Issue Type: New Feature
>  Components: streams
>Reporter: Bill Bejeck
>Assignee: Bill Bejeck
>Priority: Major
>  Labels: kip
> Fix For: 2.1.0
>
>
> Add overloaded method to {{StreamsBuilder}} accepting a 
> {{java.util.Properties}} instance.
>  
> KIP can be found here 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-312%3A+Add+Overloaded+StreamsBuilder+Build+Method+to+Accept+java.util.Properties



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7723) Kafka Connect support override worker kafka api configuration with connector configuration that post by rest api

2019-03-15 Thread Randall Hauch (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16793635#comment-16793635
 ] 

Randall Hauch commented on KAFKA-7723:
--

[~laomei], first of all, thanks for logging this issue, creating 
[KIP-407|https://cwiki.apache.org/confluence/display/KAFKA/KIP-407%3A+Kafka+Connect+support+override+worker+kafka+api+configuration+with+connector+configuration+that+post+by+rest+api],
 and creating a pull request.

However, I think this is nearly identical to (or rather a subset of) KAFKA-6890 
/ 
[KIP-296|https://cwiki.apache.org/confluence/display/KAFKA/KIP-296%3A+Connector+level+configurability+for+client+configs],
 which IMO has the correct scope and where a discussion is taking place about 
the requirements and user experience. Where this proposal seems to differ from 
KAFKA-6890 / KIP-296 is that the approach proposed here only addresses 
connector configurations specified via the REST API, and not via configuration 
files passed to the standalone Connect worker. This would be a significant 
departure from the current behavior, where the REST API and file configurations 
are completely compatible. 

Since KAFKA-6890 / KIP-296 are older, can we resolve this issue as DUPLICATE, 
close the PR without merging, and withdraw KIP-407?

> Kafka Connect support override worker kafka api configuration with connector 
> configuration that post by rest api
> 
>
> Key: KAFKA-7723
> URL: https://issues.apache.org/jira/browse/KAFKA-7723
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Reporter: laomei
>Priority: Minor
>  Labels: needs-kip
>
> I'm using a Kafka sink connector; "auto.offset.reset" is set in 
> connect-distributed*.properties. 
> That setting applies to every connector running in the worker, so each 
> consumer polls records from either latest or earliest; I cannot control 
> auto.offset.reset in the connector configs posted via the REST API.
> So I think it is necessary to let connector configs override the worker's 
> Kafka API configs, like this:
> {code:java}
>   {
> "name": "test",
> "config": {
> "consumer.auto.offset.reset": "latest",
> "consumer.xxx"
> "connector.class": "com.laomei.sis.solr.SolrConnector",
> "tasks.max": "1",
> "poll.interval.ms": "100",
> "connect.timeout.ms": "6",
> "topics": "test"
> }
>   }
> {code}
> That way we can override the Kafka consumer's auto.offset.reset per sink connector.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7723) Kafka Connect support override worker kafka api configuration with connector configuration that post by rest api

2019-03-15 Thread Randall Hauch (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-7723:
-
Labels: needs-kip  (was: )

> Kafka Connect support override worker kafka api configuration with connector 
> configuration that post by rest api
> 
>
> Key: KAFKA-7723
> URL: https://issues.apache.org/jira/browse/KAFKA-7723
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Reporter: laomei
>Priority: Minor
>  Labels: needs-kip
>
> I'm using a Kafka sink connector; "auto.offset.reset" is set in 
> connect-distributed*.properties. 
> That setting applies to every connector running in the worker, so each 
> consumer polls records from either latest or earliest; I cannot control 
> auto.offset.reset in the connector configs posted via the REST API.
> So I think it is necessary to let connector configs override the worker's 
> Kafka API configs, like this:
> {code:java}
>   {
> "name": "test",
> "config": {
> "consumer.auto.offset.reset": "latest",
> "consumer.xxx"
> "connector.class": "com.laomei.sis.solr.SolrConnector",
> "tasks.max": "1",
> "poll.interval.ms": "100",
> "connect.timeout.ms": "6",
> "topics": "test"
> }
>   }
> {code}
> That way we can override the Kafka consumer's auto.offset.reset per sink connector.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KAFKA-7898) ERROR Caught unexpected throwable (org.apache.zookeeper.ClientCnxn)

2019-03-15 Thread Steven McDonald (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16793567#comment-16793567
 ] 

Steven McDonald edited comment on KAFKA-7898 at 3/15/19 11:54 AM:
--

Hi!

We encountered this as well and I did some investigation into the problem. It 
is a bug in Kafka 2.1.x that is fixed in 2.2.x (though not explicitly; the fix 
is the result of [a 
refactor|https://github.com/apache/kafka/commit/2155c6d54b087206b6aa1d58747f141761394eaf#diff-8bcd2c427556f434e33cf22abec548c2R217]).

The underlying problem is with the Zookeeper client library's MultiCallback 
interface. The 
[documentation|https://zookeeper.apache.org/doc/r3.4.13/api/org/apache/zookeeper/AsyncCallback.MultiCallback.html]
 for this says that "all opResults are OpResult.ErrorResult", but [some error 
conditions|https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxn.java#L689]
 will pass the callback a null pointer in place of a list. Kafka 2.1.x is 
implemented according to the documentation, so the null pointer case is not 
handled, leading to this bug. I have [raised this 
issue|https://issues.apache.org/jira/projects/ZOOKEEPER/issues/ZOOKEEPER-3314] 
with Zookeeper.

I also consider it a bug that this NullPointerException leaves the Kafka 
cluster in a state that it does not recover from automatically. In our case, 
this bug was hit during a controller election, resulting in a node that was 
designated as controller but unable to function as such. It would be sufficient 
for this exception to simply kill the Kafka node so that the remaining nodes 
can recover, but I think that is a separate bug (which I will raise with 
Zookeeper first, as the exception is currently caught there).

I can provide additional information on our experience if it's of any interest, 
but since this is already fixed in Kafka 2.2.x I don't see much point expanding 
here.


was (Author: steven-usabilla):
Hi!

We encountered this as well and I did some investigation into the problem. It 
is a bug in Kafka 2.1.x that is fixed in 2.2.x (though not explicitly; the fix 
is the result of [a 
refactor|https://github.com/apache/kafka/commit/2155c6d54b087206b6aa1d58747f141761394eaf#diff-8bcd2c427556f434e33cf22abec548c2R217]).

The underlying problem is with the Zookeeper client library's MultiCallback 
interface. The 
[documentation|https://zookeeper.apache.org/doc/r3.4.13/api/org/apache/zookeeper/AsyncCallback.MultiCallback.html]
 for this says that "all opResults are OpResult.ErrorResult", but [some error 
conditions|https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxn.java#L689]
 will pass the callback a null pointer in place of a list. Kafka 2.1.x is 
implemented according to the documentation, so the null pointer case is not 
handled, leading to this bug.

I also consider it a bug that this NullPointerException leaves the Kafka 
cluster in a state that it does not recover from automatically. In our case, 
this bug was hit during a controller election, resulting in a node that was 
designated as controller but unable to function as such. It would be sufficient 
for this exception to simply kill the Kafka node so that the remaining nodes 
can recover, but I think that is a separate bug (which I will raise with 
Zookeeper first, as the exception is currently caught there).

I can provide additional information on our experience if it's of any interest, 
but since this is already fixed in Kafka 2.2.x I don't see much point expanding 
here.

> ERROR Caught unexpected throwable (org.apache.zookeeper.ClientCnxn)
> ---
>
> Key: KAFKA-7898
> URL: https://issues.apache.org/jira/browse/KAFKA-7898
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Gabriel Lukacs
>Priority: Major
>
> We observed a NullPointerException on one of our brokers in a 3-broker cluster 
> environment. If I list the processes and open ports, it seems that the faulty 
> broker is running, but Kafka Connect (which we also use) periodically 
> restarts because it cannot connect to the Kafka cluster (configured in both 
> SSL and plaintext mode). Is it a bug in Kafka/ZooKeeper?
>  
> [2019-02-05 14:28:11,359] WARN Client session timed out, have not heard from 
> server in 4141ms for sessionid 0x310166e 
> (org.apache.zookeeper.ClientCnxn)
> [2019-02-05 14:28:12,525] ERROR Caught unexpected throwable 
> (org.apache.zookeeper.ClientCnxn)
> java.lang.NullPointerException
>  at 
> kafka.zookeeper.ZooKeeperClient$$anon$8.processResult(ZooKeeperClient.scala:217)
>  at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:633)
>  at 

[jira] [Commented] (KAFKA-7898) ERROR Caught unexpected throwable (org.apache.zookeeper.ClientCnxn)

2019-03-15 Thread Steven McDonald (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16793567#comment-16793567
 ] 

Steven McDonald commented on KAFKA-7898:


Hi!

We encountered this as well and I did some investigation into the problem. It 
is a bug in Kafka 2.1.x that is fixed in 2.2.x (though not explicitly; the fix 
is the result of [a 
refactor|https://github.com/apache/kafka/commit/2155c6d54b087206b6aa1d58747f141761394eaf#diff-8bcd2c427556f434e33cf22abec548c2R217]).

The underlying problem is with the Zookeeper client library's MultiCallback 
interface. The 
[documentation|https://zookeeper.apache.org/doc/r3.4.13/api/org/apache/zookeeper/AsyncCallback.MultiCallback.html]
 for this says that "all opResults are OpResult.ErrorResult", but [some error 
conditions|https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxn.java#L689]
 will pass the callback a null pointer in place of a list. Kafka 2.1.x is 
implemented according to the documentation, so the null pointer case is not 
handled, leading to this bug.

I also consider it a bug that this NullPointerException leaves the Kafka 
cluster in a state that it does not recover from automatically. In our case, 
this bug was hit during a controller election, resulting in a node that was 
designated as controller but unable to function as such. It would be sufficient 
for this exception to simply kill the Kafka node so that the remaining nodes 
can recover, but I think that is a separate bug (which I will raise with 
Zookeeper first, as the exception is currently caught there).

I can provide additional information on our experience if it's of any interest, 
but since this is already fixed in Kafka 2.2.x I don't see much point expanding 
here.
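
For illustration only, a callback written defensively against this behavior 
would have to tolerate the undocumented null (a sketch, not Kafka's actual 
fix, which came from the refactor linked above):
{code:java}
import java.util.List;

import org.apache.zookeeper.AsyncCallback.MultiCallback;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.OpResult;

public class NullSafeMultiCallback implements MultiCallback {
    @Override
    public void processResult(int rc, String path, Object ctx, List<OpResult> opResults) {
        // Contrary to the documented contract, opResults can be null when the
        // whole multi request fails (e.g. on connection loss), so guard first.
        if (opResults == null) {
            System.err.println("multi failed: rc=" + KeeperException.Code.get(rc)
                    + ", no per-op results");
            return;
        }
        for (OpResult result : opResults) {
            if (result instanceof OpResult.ErrorResult) {
                System.err.println("op failed: err="
                        + ((OpResult.ErrorResult) result).getErr());
            }
        }
    }
}
{code}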

> ERROR Caught unexpected throwable (org.apache.zookeeper.ClientCnxn)
> ---
>
> Key: KAFKA-7898
> URL: https://issues.apache.org/jira/browse/KAFKA-7898
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Gabriel Lukacs
>Priority: Major
>
> We observed a NullPointerException on one of our brokers in a 3-broker cluster 
> environment. If I list the processes and open ports, it seems that the faulty 
> broker is running, but Kafka Connect (which we also use) periodically 
> restarts because it cannot connect to the Kafka cluster (configured in both 
> SSL and plaintext mode). Is it a bug in Kafka/ZooKeeper?
>  
> [2019-02-05 14:28:11,359] WARN Client session timed out, have not heard from 
> server in 4141ms for sessionid 0x310166e 
> (org.apache.zookeeper.ClientCnxn)
> [2019-02-05 14:28:12,525] ERROR Caught unexpected throwable 
> (org.apache.zookeeper.ClientCnxn)
> java.lang.NullPointerException
>  at 
> kafka.zookeeper.ZooKeeperClient$$anon$8.processResult(ZooKeeperClient.scala:217)
>  at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:633)
>  at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:508)
> [2019-02-05 14:28:12,526] ERROR Caught unexpected throwable 
> (org.apache.zookeeper.ClientCnxn)
> [2019-02-05 14:28:22,701] WARN Client session timed out, have not heard from 
> server in 4004ms for sessionid 0x310166e 
> (org.apache.zookeeper.ClientCnxn)
> [2019-02-05 14:28:28,670] WARN Client session timed out, have not heard from 
> server in 4049ms for sessionid 0x310166e 
> (org.apache.zookeeper.ClientCnxn)
> [2019-02-05 15:05:20,601] WARN [GroupCoordinator 1]: Failed to write empty 
> metadata for group 
> encodable-emvTokenAccess-delta-encoder-group-emvIssuerAccess-v2-2-0: The 
> group is rebalancing, so a rejoin is needed. 
> (kafka.coordinator.group.GroupCoordinator)
> kafka 7381 1 0 14:22 ? 00:00:19 java -Xmx512M -Xms512M -server -XX:+UseG1GC 
> -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 
> -XX:+ExplicitGCInvokesConcurrent -Djava.awt.headless=true 
> -Xloggc:/opt/kafka/bin/../logs/zookeeper-gc.log -verbose:gc 
> -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps 
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M 
> -Dcom.sun.management.jmxremote 
> -Dcom.sun.management.jmxremote.authenticate=false 
> -Dcom.sun.management.jmxremote.ssl=false 
> -Dkafka.logs.dir=/opt/kafka/bin/../logs 
> -Dlog4j.configuration=file:/opt/kafka/config/zoo-log4j.properties -cp 
> 

[jira] [Created] (KAFKA-8109) Consumers with isolation level 'read_committed' are getting stuck for a few partitions

2019-03-15 Thread Love Singh (JIRA)
Love Singh created KAFKA-8109:
-

 Summary: Consumers with isolation level 'read_committed' are 
getting stuck for a few partitions
 Key: KAFKA-8109
 URL: https://issues.apache.org/jira/browse/KAFKA-8109
 Project: Kafka
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Love Singh


Hello,

Consumers with isolation level set to 'read_committed' are getting stuck for a 
few partitions in a topic; for the others it is working fine.

Upon examination we have found that the LSO (last stable offset) lag for those 
topic-partitions is more than 25K (JMX metric: LastStableOffsetLag).

We can read any offsets from these topic-partitions in read_committed before 
the LSO, but consumers get stuck when they reach the LSO. read_uncommitted mode 
works fine.
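
For reference, a minimal consumer matching this description looks like the 
sketch below (the broker address, group id, and topic are assumptions, and 
poll(Duration) comes from clients newer than the affected 1.0.0):
{code:java}
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ReadCommittedProbe {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.setProperty(ConsumerConfig.GROUP_ID_CONFIG, "lso-probe");
        // read_committed caps consumption at the partition's last stable offset.
        props.setProperty(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");
        props.setProperty(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                StringDeserializer.class.getName());
        props.setProperty(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("test-topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                // If the LSO is stuck, this count drops to zero and stays there
                // even though read_uncommitted consumers keep making progress.
                System.out.println("polled " + records.count() + " records");
            }
        }
    }
}
{code}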

We have seen the below error repeatedly in our log for that partition:

_"Found no record of producerId on the broker. It is possible that the last 
message with the producerId has been removed due to hitting the retention 
limit."_

All the producers are transactional.

We are not sure what else to check here. Can someone please have a look?

 

Thanks.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8091) Flaky test DynamicBrokerReconfigurationTest#testAddRemoveSaslListener

2019-03-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16793473#comment-16793473
 ] 

ASF GitHub Bot commented on KAFKA-8091:
---

rajinisivaram commented on pull request #6450: KAFKA-8091; Use commitSync to 
check connection failure in listener update test
URL: https://github.com/apache/kafka/pull/6450
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flaky test  DynamicBrokerReconfigurationTest#testAddRemoveSaslListener 
> ---
>
> Key: KAFKA-8091
> URL: https://issues.apache.org/jira/browse/KAFKA-8091
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.2.0
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>Priority: Critical
> Fix For: 2.3.0, 2.2.1
>
>
> See KAFKA-6824 for details. Since the SSL version of the test is currently 
> skipped using @Ignore, we are fixing this for SASL first and waiting for that to be 
> stable before re-enabling SSL tests under KAFKA-6824. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7697) Possible deadlock in kafka.cluster.Partition

2019-03-15 Thread Ankit Singhal (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16793469#comment-16793469
 ] 

Ankit Singhal commented on KAFKA-7697:
--

We also hit the same issue; we had to restart the broker roughly every 6 
hours!

[~jnadler] What is the issue with 2.1.1? We are planning to move to that 
version.

[~rsivaram] Shall we move to 2.0.1 instead, since 2.1.1 was just released and 
we might hit other issues? 2.0.1 seems pretty stable!

> Possible deadlock in kafka.cluster.Partition
> 
>
> Key: KAFKA-7697
> URL: https://issues.apache.org/jira/browse/KAFKA-7697
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Gian Merlino
>Assignee: Rajini Sivaram
>Priority: Blocker
> Fix For: 2.2.0, 2.1.1
>
> Attachments: threaddump.txt
>
>
> After upgrading a fairly busy broker from 0.10.2.0 to 2.1.0, it locked up 
> within a few minutes (by "locked up" I mean that all request handler threads 
> were busy, and other brokers reported that they couldn't communicate with 
> it). I restarted it a few times and it did the same thing each time. After 
> downgrading to 0.10.2.0, the broker was stable. I attached a thread dump from 
> the last attempt on 2.1.0 that shows lots of kafka-request-handler- threads 
> trying to acquire the leaderIsrUpdateLock lock in kafka.cluster.Partition.
> It jumps out that there are two threads that already have some read lock 
> (can't tell which one) and are trying to acquire a second one (on two 
> different read locks: 0x000708184b88 and 0x00070821f188): 
> kafka-request-handler-1 and kafka-request-handler-4. Both are handling a 
> produce request, and in the process of doing so, are calling 
> Partition.fetchOffsetSnapshot while trying to complete a DelayedFetch. At the 
> same time, both of those locks have writers from other threads waiting on 
> them (kafka-request-handler-2 and kafka-scheduler-6). Neither of those locks 
> appear to have writers that hold them (if only because no threads in the dump 
> are deep enough in inWriteLock to indicate that).
> ReentrantReadWriteLock in nonfair mode prioritizes waiting writers over 
> readers. Is it possible that kafka-request-handler-1 and 
> kafka-request-handler-4 are each trying to read-lock the partition that is 
> currently locked by the other one, and they're both parked waiting for 
> kafka-request-handler-2 and kafka-scheduler-6 to get write locks, which they 
> never will, because the former two threads own read locks and aren't giving 
> them up?
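
For illustration, the hypothesized interleaving can be reproduced in isolation 
(a standalone sketch, not Kafka code; with a nonfair ReentrantReadWriteLock a 
queued writer blocks new readers, so this program is expected to hang):
{code:java}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReadLockDeadlockDemo {
    static final ReentrantReadWriteLock lockA = new ReentrantReadWriteLock(); // nonfair
    static final ReentrantReadWriteLock lockB = new ReentrantReadWriteLock(); // nonfair

    public static void main(String[] args) {
        CountDownLatch readersHold = new CountDownLatch(2);
        new Thread(() -> crossRead(lockA, lockB, readersHold), "reader-1").start();
        new Thread(() -> crossRead(lockB, lockA, readersHold), "reader-2").start();
        // Once both readers hold their first read lock, queue a writer on each.
        new Thread(() -> write(lockA, readersHold), "writer-A").start();
        new Thread(() -> write(lockB, readersHold), "writer-B").start();
    }

    static void crossRead(ReentrantReadWriteLock first, ReentrantReadWriteLock second,
                          CountDownLatch latch) {
        first.readLock().lock();
        latch.countDown();
        sleep(1000); // give both writers time to queue
        second.readLock().lock(); // parks behind the queued writer -> deadlock
    }

    static void write(ReentrantReadWriteLock lock, CountDownLatch latch) {
        await(latch);
        lock.writeLock().lock(); // parks: a reader holds this lock and never lets go
    }

    static void sleep(long ms) { try { Thread.sleep(ms); } catch (InterruptedException e) { } }
    static void await(CountDownLatch l) { try { l.await(); } catch (InterruptedException e) { } }
}
{code}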



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6178) Broker is listed as only ISR for all partitions it is leader of

2019-03-15 Thread Narayan Periwal (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16793461#comment-16793461
 ] 

Narayan Periwal commented on KAFKA-6178:


We are also seeing the same issue in our Kafka cluster. We are using version 
0.10.2.1.

 

> Broker is listed as only ISR for all partitions it is leader of
> ---
>
> Key: KAFKA-6178
> URL: https://issues.apache.org/jira/browse/KAFKA-6178
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.1.0
> Environment: Windows
>Reporter: AS
>Priority: Major
>  Labels: windows
> Attachments: KafkaServiceOutput.txt, log-cleaner.log, server.log
>
>
> We're running a 15-broker cluster on Windows machines, and one of the 
> brokers, 10, is the only ISR on all partitions that it is the leader of. On 
> partitions where it isn't the leader, it seems to follow the leader fine. 
> This is an excerpt from 'describe':
> Topic: ClientQosCombined  Partition: 458  Leader: 10  Replicas: 
> 10,6,7,8,9,0,1   Isr: 10
> Topic: ClientQosCombined  Partition: 459  Leader: 11  Replicas: 
> 11,7,8,9,0,1,10 Isr: 0,10,1,9,7,11,8
> The server.log files all seem to be pretty standard, and the only indication 
> of this issue is the following pattern that often repeats:
> 2017-11-06 20:28:25,207 [INFO] kafka.cluster.Partition 
> [kafka-request-handler-8:] - Partition [ClientQosCombined,398] on broker 10: 
> Expanding ISR for partition [ClientQosCombined,398] from 10 to 5,10
> 2017-11-06 20:28:39,382 [INFO] kafka.cluster.Partition [kafka-scheduler-1:] - 
> Partition [ClientQosCombined,398] on broker 10: Shrinking ISR for partition 
> [ClientQosCombined,398] from 5,10 to 10
> This pattern repeats for each of the partitions that 10 leads. This is the only topic that we 
> currently have in our cluster. The __consumer_offsets topic seems completely 
> normal in terms of isr counts. The controller is broker 5, which is cycling 
> through attempting and failing to trigger leader elections on broker 10 led 
> partitions. From the controller log in broker 5:
> 2017-11-06 20:45:04,857 [INFO] kafka.controller.KafkaController 
> [kafka-scheduler-0:] - [Controller 5]: Starting preferred replica leader 
> election for partitions [ClientQosCombined,375]
> 2017-11-06 20:45:04,857 [INFO] kafka.controller.PartitionStateMachine 
> [kafka-scheduler-0:] - [Partition state machine on Controller 5]: Invoking 
> state change to OnlinePartition for partitions [ClientQosCombined,375]
> 2017-11-06 20:45:04,857 [INFO] 
> kafka.controller.PreferredReplicaPartitionLeaderSelector [kafka-scheduler-0:] 
> - [PreferredReplicaPartitionLeaderSelector]: Current leader 10 for partition 
> [ClientQosCombined,375] is not the preferred replica. Trigerring preferred 
> replica leader election
> 2017-11-06 20:45:04,857 [WARN] kafka.controller.KafkaController 
> [kafka-scheduler-0:] - [Controller 5]: Partition [ClientQosCombined,375] 
> failed to complete preferred replica leader election. Leader is 10
> I've also attached the logs and output from broker 10. Any idea what's wrong 
> here? 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)