[jira] [Commented] (KAFKA-8113) Flaky Test ListOffsetsRequestTest#testResponseIncludesLeaderEpoch

2019-06-11 Thread Boyang Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16861750#comment-16861750
 ] 

Boyang Chen commented on KAFKA-8113:


Failed again 
[https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/5459/console]

> Flaky Test ListOffsetsRequestTest#testResponseIncludesLeaderEpoch
> -
>
> Key: KAFKA-8113
> URL: https://issues.apache.org/jira/browse/KAFKA-8113
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.3.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.4.0
>
>
> [https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3468/tests]
> {quote}java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:87)
> at org.junit.Assert.assertTrue(Assert.java:42)
> at org.junit.Assert.assertTrue(Assert.java:53)
> at 
> kafka.server.ListOffsetsRequestTest.fetchOffsetAndEpoch$1(ListOffsetsRequestTest.scala:136)
> at 
> kafka.server.ListOffsetsRequestTest.testResponseIncludesLeaderEpoch(ListOffsetsRequestTest.scala:151){quote}
> STDOUT
> {quote}[2019-03-15 17:16:13,029] ERROR [ReplicaFetcher replicaId=2, 
> leaderId=1, fetcherId=0] Error for partition topic-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-15 17:16:13,231] ERROR [KafkaApi-0] Error while responding to offset 
> request (kafka.server.KafkaApis:76)
> org.apache.kafka.common.errors.ReplicaNotAvailableException: Partition 
> topic-0 is not available{quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KAFKA-8525) Make log in Partition non-optional

2019-06-11 Thread Vikas Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Singh reassigned KAFKA-8525:
--

Assignee: Vikas Singh

> Make log in Partition non-optional
> 
>
> Key: KAFKA-8525
> URL: https://issues.apache.org/jira/browse/KAFKA-8525
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.3.0
>Reporter: Vikas Singh
>Assignee: Vikas Singh
>Priority: Minor
>
> While moving the log out of Replica into Partition as part of KAFKA-8457, a 
> bunch of code was cleaned up by removing checks like {{if (!localReplica) throw}}; there are 
> still a couple of additional cleanups that can be done:
>  # The log object in Partition can be made non-optional, as it doesn't make 
> sense to have a partition without a log. Here is a comment on the PR for KAFKA-8457: 
> {{I think it shouldn't be possible to have a Partition without a 
> corresponding Log. Once this is merged, I think we can look into whether we 
> can replace the optional log field in this class with a concrete instance.}}
>  # The LocalReplica class can be removed, simplifying the Replica class. Here is 
> another comment on the PR: {{it might be possible to turn Replica into a 
> trait and then let Log implement it directly. Then we could get rid of 
> LocalReplica. That would also help us clean up RemoteReplica, since the usage 
> of LogOffsetMetadata only makes sense for the local replica.}}
> Creating this JIRA to track these refactoring tasks for the future.
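
A rough Scala sketch of the trait-based refactor mentioned in the second item; the names and
simplified signatures are illustrative placeholders, not the actual kafka.cluster classes:

{code:scala}
// Simplified placeholders; not the real Kafka classes.
trait Replica {
  def brokerId: Int
  def logEndOffset: Long
}

// Local replica: backed by the actual log, so Log can implement Replica directly.
class Log(val brokerId: Int) extends Replica {
  private var nextOffset: Long = 0L
  def append(batchSize: Int): Unit = { nextOffset += batchSize }
  override def logEndOffset: Long = nextOffset
}

// Remote replica: only tracks offsets reported by the follower's fetches.
class RemoteReplica(val brokerId: Int) extends Replica {
  @volatile private var reportedLogEndOffset: Long = -1L
  def updateFetchState(offset: Long): Unit = { reportedLogEndOffset = offset }
  override def logEndOffset: Long = reportedLogEndOffset
}
{code}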



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8262) Flaky Test MetricsIntegrationTest#testStreamMetric

2019-06-11 Thread Boyang Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16861639#comment-16861639
 ] 

Boyang Chen commented on KAFKA-8262:


[https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/22647/console] Failed 
again

> Flaky Test MetricsIntegrationTest#testStreamMetric
> --
>
> Key: KAFKA-8262
> URL: https://issues.apache.org/jira/browse/KAFKA-8262
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Affects Versions: 2.3.0
>Reporter: Matthias J. Sax
>Assignee: Bruno Cadonna
>Priority: Major
>  Labels: flaky-test
> Fix For: 2.4.0
>
>
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/3900/testReport/junit/org.apache.kafka.streams.integration/MetricsIntegrationTest/testStreamMetric/]
> {quote}java.lang.AssertionError: Condition not met within timeout 1. 
> testTaskMetric -> Size of metrics of type:'commit-latency-avg' must be equal 
> to:5 but it's equal to 0 expected:<5> but was:<0> at 
> org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:361) at 
> org.apache.kafka.streams.integration.MetricsIntegrationTest.testStreamMetric(MetricsIntegrationTest.java:228){quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8527) add dynamic maintenance broker config

2019-06-11 Thread xiongqi wu (JIRA)
xiongqi wu created KAFKA-8527:
-

 Summary: add dynamic maintenance broker config
 Key: KAFKA-8527
 URL: https://issues.apache.org/jira/browse/KAFKA-8527
 Project: Kafka
  Issue Type: Improvement
Reporter: xiongqi wu
Assignee: xiongqi wu


Before we remove a broker for maintenance, we want to move all partitions off 
the broker first to avoid introducing new Under Replicated Partitions (URPs). 
That is because shutting down (or killing) a broker that still hosts live 
partitions temporarily reduces the replica count of those partitions. 
Moving partitions off a broker can be done via partition reassignment. 
However, during the partition reassignment process, new topics can be created 
by Kafka, and thereby new partitions can be added to the broker that is pending 
removal. As a result, the removal process needs to recursively move 
new topic partitions off the maintenance broker. In a production environment 
in which topic creation is frequent and the URPs caused by broker removal cannot be 
tolerated, the removal process can take multiple iterations to complete the 
partition reassignment. We want to provide a mechanism to mark a broker as a 
maintenance broker (via a cluster-level dynamic configuration). One action Kafka 
can take for a maintenance broker is to not assign new topic partitions to 
it, thereby facilitating the broker removal.
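
As one possible shape of the idea, here is a hedged Scala sketch that sets a cluster-level
dynamic default via the AdminClient. The config name "maintenance.broker.list" is hypothetical
(whatever name the feature ends up using); the empty ConfigResource name addresses the
cluster-wide broker default:

{code:scala}
import java.util.{Collections, Properties}
import org.apache.kafka.clients.admin.{AdminClient, AdminClientConfig, Config, ConfigEntry}
import org.apache.kafka.common.config.ConfigResource

object MarkMaintenanceBroker extends App {
  val props = new Properties()
  props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
  val admin = AdminClient.create(props)
  try {
    // An empty resource name addresses the cluster-wide dynamic default for all brokers.
    val clusterDefault = new ConfigResource(ConfigResource.Type.BROKER, "")
    // "maintenance.broker.list" is a hypothetical config name, used for illustration only.
    val entry = new ConfigEntry("maintenance.broker.list", "3")
    admin.alterConfigs(
      Collections.singletonMap(clusterDefault, new Config(Collections.singleton(entry)))
    ).all().get()
  } finally admin.close()
}
{code}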



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8526) Broker may select a failed dir for new replica even in the presence of other live dirs

2019-06-11 Thread Anna Povzner (JIRA)
Anna Povzner created KAFKA-8526:
---

 Summary: Broker may select a failed dir for new replica even in 
the presence of other live dirs
 Key: KAFKA-8526
 URL: https://issues.apache.org/jira/browse/KAFKA-8526
 Project: Kafka
  Issue Type: Bug
Affects Versions: 2.2.1, 2.1.1, 2.0.1, 1.1.1, 2.3.0
Reporter: Anna Povzner


Suppose a broker is configured with multiple log dirs. One of the log dirs 
fails, but there is no load on that dir, so the broker does not know about the 
failure yet, _i.e._, the failed dir is still in LogManager#_liveLogDirs. 
Suppose a new topic gets created, and the controller chooses the broker with 
the failed log dir to host one of the replicas. The broker gets a LeaderAndIsr 
request with the isNew flag set. LogManager#getOrCreateLog() selects a log dir for 
the new replica from _liveLogDirs, and then one of two things can happen:
1) getAbsolutePath can fail, in which case getOrCreateLog will throw an 
IOException;
2) creating the directory for the new replica's log may fail (_e.g._, if the 
directory became read-only after getAbsolutePath succeeded). 

In both cases, the selected dir will be marked offline (which is correct). 
However, LeaderAndIsr will return an error and the replica will be marked offline, 
even though the broker may have other live dirs. 

*Proposed solution*: The broker should retry selecting a dir for the new replica 
if the initially selected dir threw an IOException when trying to create the 
directory for the new replica. We should be able to do that in the 
LogManager#getOrCreateLog() method, but keep in mind that 
logDirFailureChannel.maybeAddOfflineLogDir does not synchronously remove the 
dir from _liveLogDirs. So, it makes sense to select the initial dir by calling 
LogManager#nextLogDir (the current implementation), but if we fail to create the log on 
that dir, one approach is to select the next dir from _liveLogDirs in round-robin 
fashion (until we get back to the initial log dir – the case where all dirs have failed).
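
A rough Scala sketch of the proposed retry; liveLogDirs, nextLogDir and createLogInDir are
placeholders standing in for the real LogManager internals, so this is only an illustration of
the round-robin idea, not the actual broker code:

{code:scala}
import java.io.{File, IOException}

object LogDirRetrySketch {
  def createLogWithRetry(liveLogDirs: IndexedSeq[File],
                         nextLogDir: () => File,
                         createLogInDir: File => Unit): File = {
    require(liveLogDirs.nonEmpty, "no live log dirs")
    // Start from the dir the current logic would pick, then walk the rest round-robin.
    val start = math.max(liveLogDirs.indexOf(nextLogDir()), 0)
    var lastError: IOException = null
    for (i <- liveLogDirs.indices) {
      val dir = liveLogDirs((start + i) % liveLogDirs.size)
      try {
        createLogInDir(dir) // may throw if this dir has silently failed (e.g. turned read-only)
        return dir
      } catch {
        case e: IOException =>
          lastError = e     // the real code would also mark the dir offline here
      }
    }
    throw lastError         // every live dir failed, so surface the last error
  }
}
{code}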



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8305) AdminClient should support creating topics with default partitions and replication factor

2019-06-11 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-8305:
---
Labels: kip  (was: )

> AdminClient should support creating topics with default partitions and 
> replication factor
> -
>
> Key: KAFKA-8305
> URL: https://issues.apache.org/jira/browse/KAFKA-8305
> Project: Kafka
>  Issue Type: Improvement
>  Components: admin
>Reporter: Almog Gavra
>Assignee: Almog Gavra
>Priority: Major
>  Labels: kip
> Fix For: 2.4.0
>
>
> Today, the AdminClient creates topics by requiring a `NewTopic` object, which 
> must contain either the partition count and replication factor or an exact 
> replica assignment (from which both are inferred). Some users, however, could benefit from 
> just using the cluster defaults for partitions and replication factor but may not want to use 
> auto topic creation.
> NOTE: I am planning on working on this, but I do not have permissions to 
> assign this ticket to myself.
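
For reference, a minimal Scala sketch of today's API, where both values must be supplied
explicitly (the topic name, partition count and replication factor below are arbitrary examples);
the proposal would let callers omit them so that the broker-side num.partitions and
default.replication.factor apply, without enabling auto topic creation:

{code:scala}
import java.util.{Collections, Properties}
import org.apache.kafka.clients.admin.{AdminClient, AdminClientConfig, NewTopic}

object CreateTopicSketch extends App {
  val props = new Properties()
  props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
  val admin = AdminClient.create(props)
  try {
    // Today both values must be supplied; 3 partitions and replication factor 2 are arbitrary.
    val topic = new NewTopic("example-topic", 3, 2.toShort)
    admin.createTopics(Collections.singleton(topic)).all().get()
  } finally admin.close()
}
{code}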



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8456) Flaky Test StoreUpgradeIntegrationTest#shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi

2019-06-11 Thread Boyang Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16861525#comment-16861525
 ] 

Boyang Chen commented on KAFKA-8456:


[~mjsax] Any tip for debugging this?

> Flaky Test  
> StoreUpgradeIntegrationTest#shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi
> ---
>
> Key: KAFKA-8456
> URL: https://issues.apache.org/jira/browse/KAFKA-8456
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Reporter: Boyang Chen
>Priority: Major
>
> [https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/22331/console]
> *01:20:07* 
> org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest.shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi
>  failed, log available in 
> /home/jenkins/jenkins-slave/workspace/kafka-pr-jdk8-scala2.11/streams/build/reports/testOutput/org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest.shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi.test.stdout*01:20:07*
>  *01:20:07* org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest 
> > shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi 
> FAILED*01:20:07* java.lang.AssertionError: Condition not met within 
> timeout 15000. Could not get expected result in time.*01:20:07* at 
> org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:375)*01:20:07*
>  at 
> org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:335)*01:20:07*
>  at 
> org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest.verifyWindowedCountWithTimestamp(StoreUpgradeIntegrationTest.java:830)*01:20:07*
>  at 
> org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest.shouldMigrateWindowStoreToTimestampedWindowStoreUsingPapi(StoreUpgradeIntegrationTest.java:573)*01:20:07*
>  at 
> org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest.shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi(StoreUpgradeIntegrationTest.java:517)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8391) Flaky Test RebalanceSourceConnectorsIntegrationTest#testDeleteConnector

2019-06-11 Thread Boyang Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16861519#comment-16861519
 ] 

Boyang Chen commented on KAFKA-8391:


Failed again: 
[https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/5415/console]

> Flaky Test RebalanceSourceConnectorsIntegrationTest#testDeleteConnector
> ---
>
> Key: KAFKA-8391
> URL: https://issues.apache.org/jira/browse/KAFKA-8391
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.3.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.4.0
>
>
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/4747/testReport/junit/org.apache.kafka.connect.integration/RebalanceSourceConnectorsIntegrationTest/testDeleteConnector/]
> {quote}java.lang.AssertionError: Condition not met within timeout 3. 
> Connector tasks did not stop in time. at 
> org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:375) at 
> org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:352) at 
> org.apache.kafka.connect.integration.RebalanceSourceConnectorsIntegrationTest.testDeleteConnector(RebalanceSourceConnectorsIntegrationTest.java:166){quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8525) Make log in Partition non-optional

2019-06-11 Thread Vikas Singh (JIRA)
Vikas Singh created KAFKA-8525:
--

 Summary: Make log in Partition non-optional
 Key: KAFKA-8525
 URL: https://issues.apache.org/jira/browse/KAFKA-8525
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 2.3.0
Reporter: Vikas Singh


While moving the log out of Replica into Partition as part of KAFKA-8457, a 
bunch of code was cleaned up by removing checks like {{if (!localReplica) throw}}; there are still 
a couple of additional cleanups that can be done:
 # The log object in Partition can be made non-optional, as it doesn't make 
sense to have a partition without a log. Here is a comment on the PR for KAFKA-8457: 
{{I think it shouldn't be possible to have a Partition without a corresponding 
Log. Once this is merged, I think we can look into whether we can replace the 
optional log field in this class with a concrete instance.}}
 # The LocalReplica class can be removed, simplifying the Replica class. Here is 
another comment on the PR: {{it might be possible to turn Replica into a trait 
and then let Log implement it directly. Then we could get rid of LocalReplica. 
That would also help us clean up RemoteReplica, since the usage of 
LogOffsetMetadata only makes sense for the local replica.}}

Creating this JIRA to track these refactoring tasks for the future.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8456) Flaky Test StoreUpgradeIntegrationTest#shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi

2019-06-11 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16861337#comment-16861337
 ] 

Matthias J. Sax commented on KAFKA-8456:


One more: 
[https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3716/tests]

> Flaky Test  
> StoreUpgradeIntegrationTest#shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi
> ---
>
> Key: KAFKA-8456
> URL: https://issues.apache.org/jira/browse/KAFKA-8456
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Reporter: Boyang Chen
>Priority: Major
>
> [https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/22331/console]
> *01:20:07* 
> org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest.shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi
>  failed, log available in 
> /home/jenkins/jenkins-slave/workspace/kafka-pr-jdk8-scala2.11/streams/build/reports/testOutput/org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest.shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi.test.stdout*01:20:07*
>  *01:20:07* org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest 
> > shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi 
> FAILED*01:20:07* java.lang.AssertionError: Condition not met within 
> timeout 15000. Could not get expected result in time.*01:20:07* at 
> org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:375)*01:20:07*
>  at 
> org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:335)*01:20:07*
>  at 
> org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest.verifyWindowedCountWithTimestamp(StoreUpgradeIntegrationTest.java:830)*01:20:07*
>  at 
> org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest.shouldMigrateWindowStoreToTimestampedWindowStoreUsingPapi(StoreUpgradeIntegrationTest.java:573)*01:20:07*
>  at 
> org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest.shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi(StoreUpgradeIntegrationTest.java:517)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-4893) async topic deletion conflicts with max topic length

2019-06-11 Thread Colin P. McCabe (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin P. McCabe updated KAFKA-4893:
---
Fix Version/s: 2.1.2

> async topic deletion conflicts with max topic length
> 
>
> Key: KAFKA-4893
> URL: https://issues.apache.org/jira/browse/KAFKA-4893
> Project: Kafka
>  Issue Type: Bug
>Reporter: Onur Karaman
>Assignee: Colin P. McCabe
>Priority: Major
> Fix For: 2.3.0, 2.1.2, 2.2.2
>
>
> As per the 
> [documentation|http://kafka.apache.org/documentation/#basic_ops_add_topic], 
> topics can be only 249 characters long to line up with typical filesystem 
> limitations:
> {quote}
> Each sharded partition log is placed into its own folder under the Kafka log 
> directory. The name of such folders consists of the topic name, appended by a 
> dash (\-) and the partition id. Since a typical folder name can not be over 
> 255 characters long, there will be a limitation on the length of topic names. 
> We assume the number of partitions will not ever be above 100,000. Therefore, 
> topic names cannot be longer than 249 characters. This leaves just enough 
> room in the folder name for a dash and a potentially 5 digit long partition 
> id.
> {quote}
> {{kafka.common.Topic.maxNameLength}} is set to 249 and is used during 
> validation.
> This limit ends up not being quite right since topic deletion ends up 
> renaming the directory to the form {{topic-partition.uniqueId-delete}} as can 
> be seen in {{LogManager.asyncDelete}}:
> {code}
> val dirName = new StringBuilder(removedLog.name)
>   .append(".")
>   .append(java.util.UUID.randomUUID.toString.replaceAll("-",""))
>   .append(Log.DeleteDirSuffix)
>   .toString()
> {code}
> So the unique id and "-delete" suffix end up hogging some of the characters. 
> Deleting a long-named topic results in a log message such as the following:
> {code}
> kafka.common.KafkaStorageException: Failed to rename log directory from 
> /tmp/kafka-logs0/0-0
>  to 
> /tmp/kafka-logs0/0-0.797bba3fb2464729840f87769243edbb-delete
>   at kafka.log.LogManager.asyncDelete(LogManager.scala:439)
>   at 
> kafka.cluster.Partition$$anonfun$delete$1.apply$mcV$sp(Partition.scala:142)
>   at kafka.cluster.Partition$$anonfun$delete$1.apply(Partition.scala:137)
>   at kafka.cluster.Partition$$anonfun$delete$1.apply(Partition.scala:137)
>   at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:213)
>   at kafka.utils.CoreUtils$.inWriteLock(CoreUtils.scala:221)
>   at kafka.cluster.Partition.delete(Partition.scala:137)
>   at kafka.server.ReplicaManager.stopReplica(ReplicaManager.scala:230)
>   at 
> kafka.server.ReplicaManager$$anonfun$stopReplicas$2.apply(ReplicaManager.scala:260)
>   at 
> kafka.server.ReplicaManager$$anonfun$stopReplicas$2.apply(ReplicaManager.scala:259)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at kafka.server.ReplicaManager.stopReplicas(ReplicaManager.scala:259)
>   at kafka.server.KafkaApis.handleStopReplicaRequest(KafkaApis.scala:174)
>   at kafka.server.KafkaApis.handle(KafkaApis.scala:86)
>   at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:64)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> The topic after this point still exists but has its leader set to -1, and the 
> controller considers the topic deletion incomplete (the topic znode is 
> still in /admin/delete_topics).
> I don't believe LinkedIn has any topic name this long, but I'm making the 
> ticket in case anyone runs into this problem.
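
A quick worked example of the arithmetic, assuming the typical 255-character limit on a single
filesystem path component; it shows why the validated limit of 249 is not safe once the delete
rename suffix is taken into account:

{code:scala}
object TopicNameLengthSketch extends App {
  val fsLimit       = 255            // typical limit on a single filesystem path component
  val partitionPart = 1 + 5          // "-" plus a partition id of up to 5 digits
  val deleteSuffix  = 1 + 32 + 7     // "." + UUID with dashes stripped + "-delete"
  val validatedMax  = fsLimit - partitionPart                // 249, today's Topic.maxNameLength
  val renameSafeMax = fsLimit - partitionPart - deleteSuffix // 209, what asyncDelete's rename tolerates
  println(s"validated limit = $validatedMax, rename-safe limit = $renameSafeMax")
}
{code}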



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-5105) ReadOnlyKeyValueStore range scans are not ordered

2019-06-11 Thread Dmitry Minkovsky (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16861256#comment-16861256
 ] 

Dmitry Minkovsky commented on KAFKA-5105:
-

Wondering what the latest on this is. This [blog 
post|https://danlebrero.com/2018/12/17/big-results-in-kafka-streams-range-query-rocksdb/]
 concludes that implementation-wise, a RocksDB-backed store, with or without 
cache, will iterate lexicographically.

> ReadOnlyKeyValueStore range scans are not ordered
> -
>
> Key: KAFKA-5105
> URL: https://issues.apache.org/jira/browse/KAFKA-5105
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Dmitry Minkovsky
>Priority: Major
>
> Following up with this thread 
> https://www.mail-archive.com/users@kafka.apache.org/msg25373.html
> Although ReadOnlyKeyValueStore's #range() is documented not to return values 
> in order, it would be great if it did, at least for keys within a single partition. 
> This would facilitate using interactive queries and local state as one would 
> use HBase to index data by prefixed keys. If range returned keys in 
> lexicographical order, I could use interactive queries for all my data needs 
> except search.
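
For context, a minimal Scala sketch of such an interactive range query; the store name
"prefix-store" and the already-running KafkaStreams instance are assumptions for illustration:

{code:scala}
import org.apache.kafka.streams.KafkaStreams
import org.apache.kafka.streams.state.{QueryableStoreTypes, ReadOnlyKeyValueStore}

object RangeQuerySketch {
  // "prefix-store" is a hypothetical store name; streams is a running KafkaStreams instance.
  def scanRange(streams: KafkaStreams, from: String, to: String): Unit = {
    val store: ReadOnlyKeyValueStore[String, java.lang.Long] =
      streams.store("prefix-store", QueryableStoreTypes.keyValueStore[String, java.lang.Long]())
    val it = store.range(from, to) // today the iteration order of these keys is not guaranteed
    try {
      while (it.hasNext) {
        val kv = it.next()
        println(s"${kv.key} -> ${kv.value}")
      }
    } finally it.close()
  }
}
{code}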



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-4893) async topic deletion conflicts with max topic length

2019-06-11 Thread Colin P. McCabe (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin P. McCabe updated KAFKA-4893:
---
Priority: Major  (was: Minor)

> async topic deletion conflicts with max topic length
> 
>
> Key: KAFKA-4893
> URL: https://issues.apache.org/jira/browse/KAFKA-4893
> Project: Kafka
>  Issue Type: Bug
>Reporter: Onur Karaman
>Assignee: Colin P. McCabe
>Priority: Major
> Fix For: 2.3.0, 2.2.2
>
>
> As per the 
> [documentation|http://kafka.apache.org/documentation/#basic_ops_add_topic], 
> topics can be only 249 characters long to line up with typical filesystem 
> limitations:
> {quote}
> Each sharded partition log is placed into its own folder under the Kafka log 
> directory. The name of such folders consists of the topic name, appended by a 
> dash (\-) and the partition id. Since a typical folder name can not be over 
> 255 characters long, there will be a limitation on the length of topic names. 
> We assume the number of partitions will not ever be above 100,000. Therefore, 
> topic names cannot be longer than 249 characters. This leaves just enough 
> room in the folder name for a dash and a potentially 5 digit long partition 
> id.
> {quote}
> {{kafka.common.Topic.maxNameLength}} is set to 249 and is used during 
> validation.
> This limit ends up not being quite right since topic deletion ends up 
> renaming the directory to the form {{topic-partition.uniqueId-delete}} as can 
> be seen in {{LogManager.asyncDelete}}:
> {code}
> val dirName = new StringBuilder(removedLog.name)
>   .append(".")
>   .append(java.util.UUID.randomUUID.toString.replaceAll("-",""))
>   .append(Log.DeleteDirSuffix)
>   .toString()
> {code}
> So the unique id and "-delete" suffix end up hogging some of the characters. 
> Deleting a long-named topic results in a log message such as the following:
> {code}
> kafka.common.KafkaStorageException: Failed to rename log directory from 
> /tmp/kafka-logs0/0-0
>  to 
> /tmp/kafka-logs0/0-0.797bba3fb2464729840f87769243edbb-delete
>   at kafka.log.LogManager.asyncDelete(LogManager.scala:439)
>   at 
> kafka.cluster.Partition$$anonfun$delete$1.apply$mcV$sp(Partition.scala:142)
>   at kafka.cluster.Partition$$anonfun$delete$1.apply(Partition.scala:137)
>   at kafka.cluster.Partition$$anonfun$delete$1.apply(Partition.scala:137)
>   at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:213)
>   at kafka.utils.CoreUtils$.inWriteLock(CoreUtils.scala:221)
>   at kafka.cluster.Partition.delete(Partition.scala:137)
>   at kafka.server.ReplicaManager.stopReplica(ReplicaManager.scala:230)
>   at 
> kafka.server.ReplicaManager$$anonfun$stopReplicas$2.apply(ReplicaManager.scala:260)
>   at 
> kafka.server.ReplicaManager$$anonfun$stopReplicas$2.apply(ReplicaManager.scala:259)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at kafka.server.ReplicaManager.stopReplicas(ReplicaManager.scala:259)
>   at kafka.server.KafkaApis.handleStopReplicaRequest(KafkaApis.scala:174)
>   at kafka.server.KafkaApis.handle(KafkaApis.scala:86)
>   at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:64)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> The topic after this point still exists but has its leader set to -1, and the 
> controller considers the topic deletion incomplete (the topic znode is 
> still in /admin/delete_topics).
> I don't believe LinkedIn has any topic name this long, but I'm making the 
> ticket in case anyone runs into this problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-4893) async topic deletion conflicts with max topic length

2019-06-11 Thread Colin P. McCabe (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin P. McCabe updated KAFKA-4893:
---
Fix Version/s: 2.2.2

> async topic deletion conflicts with max topic length
> 
>
> Key: KAFKA-4893
> URL: https://issues.apache.org/jira/browse/KAFKA-4893
> Project: Kafka
>  Issue Type: Bug
>Reporter: Onur Karaman
>Assignee: Colin P. McCabe
>Priority: Minor
> Fix For: 2.3.0, 2.2.2
>
>
> As per the 
> [documentation|http://kafka.apache.org/documentation/#basic_ops_add_topic], 
> topics can be only 249 characters long to line up with typical filesystem 
> limitations:
> {quote}
> Each sharded partition log is placed into its own folder under the Kafka log 
> directory. The name of such folders consists of the topic name, appended by a 
> dash (\-) and the partition id. Since a typical folder name can not be over 
> 255 characters long, there will be a limitation on the length of topic names. 
> We assume the number of partitions will not ever be above 100,000. Therefore, 
> topic names cannot be longer than 249 characters. This leaves just enough 
> room in the folder name for a dash and a potentially 5 digit long partition 
> id.
> {quote}
> {{kafka.common.Topic.maxNameLength}} is set to 249 and is used during 
> validation.
> This limit ends up not being quite right since topic deletion ends up 
> renaming the directory to the form {{topic-partition.uniqueId-delete}} as can 
> be seen in {{LogManager.asyncDelete}}:
> {code}
> val dirName = new StringBuilder(removedLog.name)
>   .append(".")
>   .append(java.util.UUID.randomUUID.toString.replaceAll("-",""))
>   .append(Log.DeleteDirSuffix)
>   .toString()
> {code}
> So the unique id and "-delete" suffix end up hogging some of the characters. 
> Deleting a long-named topic results in a log message such as the following:
> {code}
> kafka.common.KafkaStorageException: Failed to rename log directory from 
> /tmp/kafka-logs0/0-0
>  to 
> /tmp/kafka-logs0/0-0.797bba3fb2464729840f87769243edbb-delete
>   at kafka.log.LogManager.asyncDelete(LogManager.scala:439)
>   at 
> kafka.cluster.Partition$$anonfun$delete$1.apply$mcV$sp(Partition.scala:142)
>   at kafka.cluster.Partition$$anonfun$delete$1.apply(Partition.scala:137)
>   at kafka.cluster.Partition$$anonfun$delete$1.apply(Partition.scala:137)
>   at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:213)
>   at kafka.utils.CoreUtils$.inWriteLock(CoreUtils.scala:221)
>   at kafka.cluster.Partition.delete(Partition.scala:137)
>   at kafka.server.ReplicaManager.stopReplica(ReplicaManager.scala:230)
>   at 
> kafka.server.ReplicaManager$$anonfun$stopReplicas$2.apply(ReplicaManager.scala:260)
>   at 
> kafka.server.ReplicaManager$$anonfun$stopReplicas$2.apply(ReplicaManager.scala:259)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at kafka.server.ReplicaManager.stopReplicas(ReplicaManager.scala:259)
>   at kafka.server.KafkaApis.handleStopReplicaRequest(KafkaApis.scala:174)
>   at kafka.server.KafkaApis.handle(KafkaApis.scala:86)
>   at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:64)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> The topic after this point still exists but has its leader set to -1, and the 
> controller considers the topic deletion incomplete (the topic znode is 
> still in /admin/delete_topics).
> I don't believe LinkedIn has any topic name this long, but I'm making the 
> ticket in case anyone runs into this problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8487) Consumer should not resetGeneration upon REBALANCE_IN_PROGRESS in commit response handler

2019-06-11 Thread Guozhang Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-8487:
-
Priority: Major  (was: Blocker)

> Consumer should not resetGeneration upon REBALANCE_IN_PROGRESS in commit 
> response handler
> -
>
> Key: KAFKA-8487
> URL: https://issues.apache.org/jira/browse/KAFKA-8487
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, streams
>Affects Versions: 2.3.0
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
>Priority: Major
> Fix For: 2.4.0
>
>
> In consumer, we handle the errors in sync / heartbeat / join response such 
> that:
> 1. UNKNOWN_MEMBER_ID / ILLEGAL_GENERATION: we reset the generation and 
> request re-join.
> 2. REBALANCE_IN_PROGRESS: do nothing if a rejoin will be executed, or request 
> re-join explicitly.
> However, for the commit response, we call resetGeneration on 
> REBALANCE_IN_PROGRESS as well. This is flawed in two ways:
> 1. As in KIP-345, with static members, resetting the generation will lose the 
> member.id and hence may cause incorrect fencing.
> 2. As in KIP-429, resetting the generation will cause partitions to be "lost" 
> unnecessarily before re-joining the group. 
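
An illustrative Scala sketch of the handling described above; the real logic lives in the Java
ConsumerCoordinator, and the names here are simplified placeholders:

{code:scala}
object CommitErrorHandlingSketch {
  sealed trait CommitError
  case object UnknownMemberId     extends CommitError
  case object IllegalGeneration   extends CommitError
  case object RebalanceInProgress extends CommitError

  def handle(error: CommitError,
             resetGenerationAndRejoin: () => Unit,
             requestRejoin: () => Unit): Unit = error match {
    case UnknownMemberId | IllegalGeneration =>
      resetGenerationAndRejoin() // the member is fenced, so the generation must be reset
    case RebalanceInProgress =>
      requestRejoin()            // keep member.id and owned partitions; only request a rejoin
  }
}
{code}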



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8487) Consumer should not resetGeneration upon REBALANCE_IN_PROGRESS in commit response handler

2019-06-11 Thread Guozhang Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-8487:
-
Affects Version/s: (was: 2.3.0)

> Consumer should not resetGeneration upon REBALANCE_IN_PROGRESS in commit 
> response handler
> -
>
> Key: KAFKA-8487
> URL: https://issues.apache.org/jira/browse/KAFKA-8487
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, streams
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
>Priority: Major
> Fix For: 2.4.0
>
>
> In consumer, we handle the errors in sync / heartbeat / join response such 
> that:
> 1. UNKNOWN_MEMBER_ID / ILLEGAL_GENERATION: we reset the generation and 
> request re-join.
> 2. REBALANCE_IN_PROGRESS: do nothing if a rejoin will be executed, or request 
> re-join explicitly.
> However, for the commit response, we call resetGeneration on 
> REBALANCE_IN_PROGRESS as well. This is flawed in two ways:
> 1. As in KIP-345, with static members, resetting the generation will lose the 
> member.id and hence may cause incorrect fencing.
> 2. As in KIP-429, resetting the generation will cause partitions to be "lost" 
> unnecessarily before re-joining the group. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-8487) Consumer should not resetGeneration upon REBALANCE_IN_PROGRESS in commit response handler

2019-06-11 Thread Guozhang Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-8487.
--
   Resolution: Fixed
Fix Version/s: 2.4.0

> Consumer should not resetGeneration upon REBALANCE_IN_PROGRESS in commit 
> response handler
> -
>
> Key: KAFKA-8487
> URL: https://issues.apache.org/jira/browse/KAFKA-8487
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, streams
>Affects Versions: 2.3.0
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
>Priority: Blocker
> Fix For: 2.4.0
>
>
> In consumer, we handle the errors in sync / heartbeat / join response such 
> that:
> 1. UNKNOWN_MEMBER_ID / ILLEGAL_GENERATION: we reset the generation and 
> request re-join.
> 2. REBALANCE_IN_PROGRESS: do nothing if a rejoin will be executed, or request 
> re-join explicitly.
> However, for the commit response, we call resetGeneration on 
> REBALANCE_IN_PROGRESS as well. This is flawed in two ways:
> 1. As in KIP-345, with static members, resetting the generation will lose the 
> member.id and hence may cause incorrect fencing.
> 2. As in KIP-429, resetting the generation will cause partitions to be "lost" 
> unnecessarily before re-joining the group. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8487) Consumer should not resetGeneration upon REBALANCE_IN_PROGRESS in commit response handler

2019-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16861208#comment-16861208
 ] 

ASF GitHub Bot commented on KAFKA-8487:
---

guozhangwang commented on pull request #6894: KAFKA-8487: Only request re-join 
on REBALANCE_IN_PROGRESS in CommitOffsetResponse
URL: https://github.com/apache/kafka/pull/6894
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Consumer should not resetGeneration upon REBALANCE_IN_PROGRESS in commit 
> response handler
> -
>
> Key: KAFKA-8487
> URL: https://issues.apache.org/jira/browse/KAFKA-8487
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, streams
>Affects Versions: 2.3.0
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
>Priority: Blocker
>
> In consumer, we handle the errors in sync / heartbeat / join response such 
> that:
> 1. UNKNOWN_MEMBER_ID / ILLEGAL_GENERATION: we reset the generation and 
> request re-join.
> 2. REBALANCE_IN_PROGRESS: do nothing if a rejoin will be executed, or request 
> re-join explicitly.
> However, for the commit response, we call resetGeneration on 
> REBALANCE_IN_PROGRESS as well. This is flawed in two ways:
> 1. As in KIP-345, with static members, resetting the generation will lose the 
> member.id and hence may cause incorrect fencing.
> 2. As in KIP-429, resetting the generation will cause partitions to be "lost" 
> unnecessarily before re-joining the group. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7469) Broker keeps group rebalance after adding FS

2019-06-11 Thread Gwen Shapira (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira resolved KAFKA-7469.
-
Resolution: Fixed

> Broker keeps group rebalance after adding FS 
> -
>
> Key: KAFKA-7469
> URL: https://issues.apache.org/jira/browse/KAFKA-7469
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.1.1
>Reporter: Boaz
>Priority: Major
> Fix For: 0.8.1.2
>
>
> Hi,
> I'm using a kafka_2.10-0.10.1.1 cluster with 3 brokers.
> A few days ago, we started running out of filesystem space and our system admin 
> allocated some more disk space. After the allocation, we started experiencing high lags 
> on the consumers, which kept growing. 
> On the consumer side, we saw that no data was being consumed and the following 
> message kept appearing in the log files:
> o.a.k.c.c.internals.AbstractCoordinator : (Re-)joining group 
> AutoPaymentSyncGroup
>  
> In the broker logs, we kept seeing messages about restabilizing groups:
> [2018-09-27 18:38:16,264] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group AutoPaymentActivityGroup with old generation 357 
> (kafka.coordinator.GroupCoordinator)
> [2018-09-27 18:38:16,264] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group AutoPaymentCreditCardTypeGroup with old generation 278 
> (kafka.coordinator.GroupCoordinator)
> [2018-09-27 18:38:16,284] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group AutoPaymentAuthChannelGroup with old generation 349 
> (kafka.coordinator.GroupCoordinator)
> [2018-09-27 18:38:16,411] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group AutoPaymentAuthCodeGroup with old generation 284 
> (kafka.coordinator.GroupCoordinator)
> [2018-09-27 18:38:16,463] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group AutoPaymentInteractionSyncGroup with old generation 359 
> (kafka.coordinator.GroupCoordinator)
> [2018-09-27 18:38:16,464] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group AutoPaymentSyncGroup with old generation 358 
> (kafka.coordinator.GroupCoordinator)
>  
> After bouncing the broker, the issue was resolved.
> Thanks,
> Boaz.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8524) Zookeeper Acl Sensitive Path Extension

2019-06-11 Thread sebastien diaz (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sebastien diaz updated KAFKA-8524:
--
Labels: path zkcli zookeeper  (was: )

> Zookeeper Acl Sensitive Path Extension
> --
>
> Key: KAFKA-8524
> URL: https://issues.apache.org/jira/browse/KAFKA-8524
> Project: Kafka
>  Issue Type: Bug
>  Components: zkclient
>Affects Versions: 1.1.0, 2.2.1
>Reporter: sebastien diaz
>Priority: Major
>  Labels: path, zkcli, zookeeper
>
> There is too much readable config in Zookeeper, such as /brokers, /controller, 
> /kafka-acl, etc.
> As Zookeeper can be accessed by other projects/users, the security should be 
> extended properly with Zookeeper ACLs.
> We should have the possibility to set these paths by configuration and not 
> (as it is today) in the code.
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8524) Zookeeper Acl Sensitive Path Extension

2019-06-11 Thread sebastien diaz (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sebastien diaz updated KAFKA-8524:
--
Component/s: zkclient

> Zookeeper Acl Sensitive Path Extension
> --
>
> Key: KAFKA-8524
> URL: https://issues.apache.org/jira/browse/KAFKA-8524
> Project: Kafka
>  Issue Type: Bug
>  Components: zkclient
>Affects Versions: 1.1.0, 2.2.1
>Reporter: sebastien diaz
>Priority: Major
>
> There is too much readable config in Zookeeper, such as /brokers, /controller, 
> /kafka-acl, etc.
> As Zookeeper can be accessed by other projects/users, the security should be 
> extended properly with Zookeeper ACLs.
> We should have the possibility to set these paths by configuration and not 
> (as it is today) in the code.
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8524) Zookeeper Acl Sensitive Path

2019-06-11 Thread sebastien diaz (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sebastien diaz updated KAFKA-8524:
--
Description: 
There is too much readable config in Zookeeper, such as /brokers, /controller, 
/kafka-acl, etc.

As Zookeeper can be accessed by other projects/users, the security should be 
extended properly with Zookeeper ACLs.

We should have the possibility to set these paths by configuration and not (as 
it is today) in the code.

 

 

 

> Zookeeper Acl Sensitive Path
> 
>
> Key: KAFKA-8524
> URL: https://issues.apache.org/jira/browse/KAFKA-8524
> Project: Kafka
>  Issue Type: Bug
>Reporter: sebastien diaz
>Priority: Major
>
> There is too much readable config in Zookeeper, such as /brokers, /controller, 
> /kafka-acl, etc.
> As Zookeeper can be accessed by other projects/users, the security should be 
> extended properly with Zookeeper ACLs.
> We should have the possibility to set these paths by configuration and not 
> (as it is today) in the code.
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8524) Zookeeper Acl Sensitive Path Extension

2019-06-11 Thread sebastien diaz (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sebastien diaz updated KAFKA-8524:
--
Summary: Zookeeper Acl Sensitive Path Extension  (was: Zookeeper Acl 
Sensitive Path)

> Zookeeper Acl Sensitive Path Extension
> --
>
> Key: KAFKA-8524
> URL: https://issues.apache.org/jira/browse/KAFKA-8524
> Project: Kafka
>  Issue Type: Bug
>Reporter: sebastien diaz
>Priority: Major
>
> There is too much readable config in Zookeeper, such as /brokers, /controller, 
> /kafka-acl, etc.
> As Zookeeper can be accessed by other projects/users, the security should be 
> extended properly with Zookeeper ACLs.
> We should have the possibility to set these paths by configuration and not 
> (as it is today) in the code.
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8524) Zookeeper Acl Sensitive Path Extension

2019-06-11 Thread sebastien diaz (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sebastien diaz updated KAFKA-8524:
--
Affects Version/s: 1.1.0
   2.2.1

> Zookeeper Acl Sensitive Path Extension
> --
>
> Key: KAFKA-8524
> URL: https://issues.apache.org/jira/browse/KAFKA-8524
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 1.1.0, 2.2.1
>Reporter: sebastien diaz
>Priority: Major
>
> There is too much readable config in Zookeeper, such as /brokers, /controller, 
> /kafka-acl, etc.
> As Zookeeper can be accessed by other projects/users, the security should be 
> extended properly with Zookeeper ACLs.
> We should have the possibility to set these paths by configuration and not 
> (as it is today) in the code.
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8524) Zookeeper Acl Sensitive Path

2019-06-11 Thread sebastien diaz (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sebastien diaz updated KAFKA-8524:
--
Summary: Zookeeper Acl Sensitive Path  (was: Zookeeper Acl )

> Zookeeper Acl Sensitive Path
> 
>
> Key: KAFKA-8524
> URL: https://issues.apache.org/jira/browse/KAFKA-8524
> Project: Kafka
>  Issue Type: Bug
>Reporter: sebastien diaz
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8524) Zookeeper Acl

2019-06-11 Thread sebastien diaz (JIRA)
sebastien diaz created KAFKA-8524:
-

 Summary: Zookeeper Acl 
 Key: KAFKA-8524
 URL: https://issues.apache.org/jira/browse/KAFKA-8524
 Project: Kafka
  Issue Type: Bug
Reporter: sebastien diaz






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8456) Flaky Test StoreUpgradeIntegrationTest#shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi

2019-06-11 Thread Boyang Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16861155#comment-16861155
 ] 

Boyang Chen commented on KAFKA-8456:


Failed again: 
[https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/5432/consoleFull]

> Flaky Test  
> StoreUpgradeIntegrationTest#shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi
> ---
>
> Key: KAFKA-8456
> URL: https://issues.apache.org/jira/browse/KAFKA-8456
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Reporter: Boyang Chen
>Priority: Major
>
> [https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/22331/console]
> *01:20:07* 
> org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest.shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi
>  failed, log available in 
> /home/jenkins/jenkins-slave/workspace/kafka-pr-jdk8-scala2.11/streams/build/reports/testOutput/org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest.shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi.test.stdout*01:20:07*
>  *01:20:07* org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest 
> > shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi 
> FAILED*01:20:07* java.lang.AssertionError: Condition not met within 
> timeout 15000. Could not get expected result in time.*01:20:07* at 
> org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:375)*01:20:07*
>  at 
> org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:335)*01:20:07*
>  at 
> org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest.verifyWindowedCountWithTimestamp(StoreUpgradeIntegrationTest.java:830)*01:20:07*
>  at 
> org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest.shouldMigrateWindowStoreToTimestampedWindowStoreUsingPapi(StoreUpgradeIntegrationTest.java:573)*01:20:07*
>  at 
> org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest.shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi(StoreUpgradeIntegrationTest.java:517)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8502) Flaky test AdminClientIntegrationTest#testElectUncleanLeadersForAllPartitions

2019-06-11 Thread Boyang Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16861154#comment-16861154
 ] 

Boyang Chen commented on KAFKA-8502:


[https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/5432/consoleFull]

Failed again

> Flaky test AdminClientIntegrationTest#testElectUncleanLeadersForAllPartitions
> --
>
> Key: KAFKA-8502
> URL: https://issues.apache.org/jira/browse/KAFKA-8502
> Project: Kafka
>  Issue Type: Bug
>Reporter: Boyang Chen
>Priority: Major
>
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/5355/consoleFull]
>  
> *18:06:01* *18:06:01* kafka.api.AdminClientIntegrationTest > 
> testElectUncleanLeadersForAllPartitions FAILED*18:06:01* 
> java.util.concurrent.ExecutionException: 
> org.apache.kafka.common.errors.TimeoutException: Aborted due to 
> timeout.*18:06:01* at 
> org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)*18:06:01*
>  at 
> org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)*18:06:01*
>  at 
> org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)*18:06:01*
>  at 
> org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)*18:06:01*
>  at 
> kafka.api.AdminClientIntegrationTest.testElectUncleanLeadersForAllPartitions(AdminClientIntegrationTest.scala:1496)*18:06:01*
>  *18:06:01* Caused by:*18:06:01* 
> org.apache.kafka.common.errors.TimeoutException: Aborted due to timeout.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-8501) Remove key and value from exception message

2019-06-11 Thread Bill Bejeck (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Bejeck resolved KAFKA-8501.

Resolution: Fixed

> Remove key and value from exception message
> ---
>
> Key: KAFKA-8501
> URL: https://issues.apache.org/jira/browse/KAFKA-8501
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Badai Aqrandista
>Assignee: Carlos Manuel Duclos Vergara
>Priority: Major
>  Labels: easy-fix, newbie
> Fix For: 2.4.0
>
>
> KAFKA-7510 moves the WARN messages that contain key and value to TRACE level. 
> But the exceptions still contain key and value. These are the two in 
> RecordCollectorImpl:
>  
> [https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/RecordCollectorImpl.java#L185-L196]
>  
> [https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/RecordCollectorImpl.java#L243-L254]
>  
> Can these be modified as well to remove key and value from the error message, 
> since we don't know at what log level it will end up being printed?
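
One possible shape of the change, sketched in Scala with assumed names (the real code is Java,
in RecordCollectorImpl): report coordinates and sizes instead of the raw key and value:

{code:scala}
object ExceptionMessageSketch {
  def sendErrorMessage(topic: String, partition: Int,
                       keyBytes: Array[Byte], valueBytes: Array[Byte]): String = {
    def size(b: Array[Byte]) = if (b == null) 0 else b.length
    // Sizes are safe to include at any log level; the payload itself is not.
    s"Error sending record to topic $topic, partition $partition " +
      s"(key size: ${size(keyBytes)} bytes, value size: ${size(valueBytes)} bytes)"
  }
}
{code}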



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8501) Remove key and value from exception message

2019-06-11 Thread Bill Bejeck (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Bejeck updated KAFKA-8501:
---
Fix Version/s: 2.4.0

> Remove key and value from exception message
> ---
>
> Key: KAFKA-8501
> URL: https://issues.apache.org/jira/browse/KAFKA-8501
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Badai Aqrandista
>Assignee: Carlos Manuel Duclos Vergara
>Priority: Major
>  Labels: easy-fix, newbie
> Fix For: 2.4.0
>
>
> KAFKA-7510 moves the WARN messages that contain key and value to TRACE level. 
> But the exceptions still contain key and value. These are the two in 
> RecordCollectorImpl:
>  
> [https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/RecordCollectorImpl.java#L185-L196]
>  
> [https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/RecordCollectorImpl.java#L243-L254]
>  
> Can these be modified as well to remove key and value from the error message, 
> since we don't know at what log level it will end up being printed?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8501) Remove key and value from exception message

2019-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16861073#comment-16861073
 ] 

ASF GitHub Bot commented on KAFKA-8501:
---

bbejeck commented on pull request #6904: KAFKA-8501: Removing key and value 
from exception message
URL: https://github.com/apache/kafka/pull/6904
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Remove key and value from exception message
> ---
>
> Key: KAFKA-8501
> URL: https://issues.apache.org/jira/browse/KAFKA-8501
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Badai Aqrandista
>Assignee: Carlos Manuel Duclos Vergara
>Priority: Major
>  Labels: easy-fix, newbie
>
> KAFKA-7510 moves the WARN messages that contain key and value to TRACE level. 
> But the exceptions still contain key and value. These are the two in 
> RecordCollectorImpl:
>  
> [https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/RecordCollectorImpl.java#L185-L196]
>  
> [https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/RecordCollectorImpl.java#L243-L254]
>  
> Can these be modified as well to remove key and value from the error message, 
> since we don't know at what log level it will end up being printed?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8334) Occasional OffsetCommit Timeout

2019-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16861023#comment-16861023
 ] 

ASF GitHub Bot commented on KAFKA-8334:
---

windkit commented on pull request #6915: KAFKA-8334 Executor to retry delayed 
operations failed to obtain lock
URL: https://github.com/apache/kafka/pull/6915
 
 
   **ASF**: https://issues.apache.org/jira/browse/KAFKA-8334
   
   ### Brief Summary:
   We have seen `OffsetCommit` time out when we do a manual offset commit with 
low traffic.
   When we append the offset commit to the topic `__consumer_offsets`, the 
`DelayedProduce` needs to acquire the group metadata lock in order to complete.
   If the lock is held by another request (such as a `Heartbeat`), completion fails 
and is only retried when the next `OffsetCommit` arrives.
   
   ### Reproduce
    Methodology
   1. DelayedProduce on __consumer_offsets could not be completed if the 
group.lock is acquired by others
   2. We spam requests like Heartbeat to keep acquiring group.lock
   3. We keep sending OffsetCommit and check the processing time
   
    Reproduce Script
   https://gist.github.com/windkit/3384bb86dc146111d1e0857e66b85861
   - jammer.py - join the group "wilson-tester" and keep spamming Heartbeat
   - tester.py - fetch one message and do a long processing (or sleep) and then 
commit the offset
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Occasional OffsetCommit Timeout
> ---
>
> Key: KAFKA-8334
> URL: https://issues.apache.org/jira/browse/KAFKA-8334
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.1.1
>Reporter: windkithk
>Priority: Major
> Attachments: kafka8334.patch, offsetcommit_p99.9_cut.jpg
>
>
> h2. 1) Issue Summary
> Since we have upgraded to 1.1, we have observed occasional OffsetCommit 
> timeouts from clients
> {code:java}
> Offset commit failed on partition sometopic-somepartition at offset 
> someoffset: The request timed out{code}
> Normally OffsetCommit completes within 10ms but when we check the 99.9 
> percentile, we see the request duration time jumps up to 5000 ms 
> (offsets.commit.timeout.ms)
> Here is a screenshot of prometheus recording 
> kafka_network_request_duration_milliseconds
> (offsetcommit_p99.9_cut.jpg)
> and after checking the duration breakdown, most of the time was spent in the 
> "Remote" scope.
> (Below is a request log line produced by an in-house slow request logger:
> {code:java}
> [2019-04-16 13:06:20,339] WARN Slow 
> response:RequestHeader(apiKey=OFFSET_COMMIT, apiVersion=2, 
> clientId=kafka-python-1.4.6, correlationId=953) -- 
> {group_id=wilson-tester,generation_id=28,member_id=kafka-python-1.4.6-69ed979d-a069-4c6d-9862-e4fc34883269,retention_time=-1,topics=[{topic=test,partitions=[{partition=2,offset=63456,metadata=null}]}]}
>  from 
> connection;totalTime:5001.942000,requestQueueTime:0.03,localTime:0.574000,remoteTime:5001.173000,responseQueueTime:0.058000,sendTime:0.053000,securityProtocol:PLAINTEXT,principal:User:ANONYMOUS
>  (kafka.request.logger)
> {code}
> h2. 2) What got changed in 1.1 from 0.10.2.1?
> # Log Level Changed
> In the 1.1 Kafka consumer, logging about timed-out OffsetCommit requests was changed 
> from DEBUG to WARN
> # Group Lock is acquired when trying to complete DelayedProduce of 
> OffsetCommit
> This was added after 0.11.0.2
> (Ticket: https://issues.apache.org/jira/browse/KAFKA-6042)
> (PR: https://github.com/apache/kafka/pull/4103)
> (in 1.1 
> https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/coordinator/group/GroupMetadataManager.scala#L292)
> # Followers do incremental fetch
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-227%3A+Introduce+Incremental+FetchRequests+to+Increase+Partition+Scalability
> h2. 3) Interaction between OffsetCommit, DelayedProduce and FetchFollower
> {quote}
> OffsetCommit appends a message with the committed offset to a partition of the 
> topic `__consumer_offsets`.
> During the append, it creates a DelayedProduce holding GroupMetadata.lock (a 
> ReentrantLock) and adds it to delayedProducePurgatory.
> When a follower fetches the partition of `__consumer_offsets` and causes 
> an increase in the high watermark, delayedProducePurgatory is traversed 
> and all operations related to the partition may be completed.
> {quote}
> *DelayedProduce from 

[jira] [Commented] (KAFKA-8523) InsertField transformation fails when encountering tombstone event

2019-06-11 Thread Gunnar Morling (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860794#comment-16860794
 ] 

Gunnar Morling commented on KAFKA-8523:
---

Filed https://github.com/apache/kafka/pull/6914.

> InsertField transformation fails when encountering tombstone event
> --
>
> Key: KAFKA-8523
> URL: https://issues.apache.org/jira/browse/KAFKA-8523
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Gunnar Morling
>Priority: Major
>
> When applying the {{InsertField}} transformation to a tombstone event, an 
> exception is raised:
> {code}
> org.apache.kafka.connect.errors.DataException: Only Map objects supported in 
> absence of schema for [field insertion], found: null
>   at 
> org.apache.kafka.connect.transforms.util.Requirements.requireMap(Requirements.java:38)
>   at 
> org.apache.kafka.connect.transforms.InsertField.applySchemaless(InsertField.java:138)
>   at 
> org.apache.kafka.connect.transforms.InsertField.apply(InsertField.java:131)
>   at 
> org.apache.kafka.connect.transforms.InsertFieldTest.tombstone(InsertFieldTest.java:128)
> {code}
> AFAICS, the transform can still be made to work in this case by simply 
> building up a new value map from scratch.
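A minimal sketch of the approach the description suggests (build a new value map when 
the incoming value is null); the helper name and parameters are illustrative, and this 
is not necessarily the actual change in PR #6914:

{code:java}
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.connect.connector.ConnectRecord;
import org.apache.kafka.connect.transforms.util.Requirements;

public final class InsertFieldTombstoneSketch {

    public static <R extends ConnectRecord<R>> R applySchemaless(R record,
                                                                  String fieldName,
                                                                  Object fieldValue) {
        final Map<String, Object> updatedValue;
        if (record.value() == null) {
            // Tombstone: build a new value map from scratch instead of calling requireMap(null, ...)
            updatedValue = new HashMap<>();
        } else {
            updatedValue = new HashMap<>(
                    Requirements.requireMap(record.value(), "field insertion"));
        }
        updatedValue.put(fieldName, fieldValue);
        return record.newRecord(record.topic(), record.kafkaPartition(),
                record.keySchema(), record.key(),
                null, updatedValue, record.timestamp());
    }
}
{code}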



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8523) InsertField transformation fails when encountering tombstone event

2019-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860793#comment-16860793
 ] 

ASF GitHub Bot commented on KAFKA-8523:
---

gunnarmorling commented on pull request #6914: KAFKA-8523 Enabling InsertField 
transform to be used with tombstone events
URL: https://github.com/apache/kafka/pull/6914
 
 
   Fixes https://issues.apache.org/jira/browse/KAFKA-8523
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> InsertField transformation fails when encountering tombstone event
> --
>
> Key: KAFKA-8523
> URL: https://issues.apache.org/jira/browse/KAFKA-8523
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Gunnar Morling
>Priority: Major
>
> When applying the {{InsertField}} transformation to a tombstone event, an 
> exception is raised:
> {code}
> org.apache.kafka.connect.errors.DataException: Only Map objects supported in 
> absence of schema for [field insertion], found: null
>   at 
> org.apache.kafka.connect.transforms.util.Requirements.requireMap(Requirements.java:38)
>   at 
> org.apache.kafka.connect.transforms.InsertField.applySchemaless(InsertField.java:138)
>   at 
> org.apache.kafka.connect.transforms.InsertField.apply(InsertField.java:131)
>   at 
> org.apache.kafka.connect.transforms.InsertFieldTest.tombstone(InsertFieldTest.java:128)
> {code}
> AFAICS, the transform can still be made to work in this case by simply 
> building up a new value map from scratch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8523) InsertField transformation fails when encountering tombstone event

2019-06-11 Thread Gunnar Morling (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860688#comment-16860688
 ] 

Gunnar Morling commented on KAFKA-8523:
---

I'll provide a PR shortly.

> InsertField transformation fails when encountering tombstone event
> --
>
> Key: KAFKA-8523
> URL: https://issues.apache.org/jira/browse/KAFKA-8523
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Gunnar Morling
>Priority: Major
>
> When applying the {{InsertField}} transformation to a tombstone event, an 
> exception is raised:
> {code}
> org.apache.kafka.connect.errors.DataException: Only Map objects supported in 
> absence of schema for [field insertion], found: null
>   at 
> org.apache.kafka.connect.transforms.util.Requirements.requireMap(Requirements.java:38)
>   at 
> org.apache.kafka.connect.transforms.InsertField.applySchemaless(InsertField.java:138)
>   at 
> org.apache.kafka.connect.transforms.InsertField.apply(InsertField.java:131)
>   at 
> org.apache.kafka.connect.transforms.InsertFieldTest.tombstone(InsertFieldTest.java:128)
> {code}
> AFAICS, the transform can still be made to work in this case by simply 
> building up a new value map from scratch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8523) InsertField transformation fails when encountering tombstone event

2019-06-11 Thread Gunnar Morling (JIRA)
Gunnar Morling created KAFKA-8523:
-

 Summary: InsertField transformation fails when encountering 
tombstone event
 Key: KAFKA-8523
 URL: https://issues.apache.org/jira/browse/KAFKA-8523
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Reporter: Gunnar Morling


When applying the {{InsertField}} transformation to a tombstone event, an 
exception is raised:

{code}
org.apache.kafka.connect.errors.DataException: Only Map objects supported in 
absence of schema for [field insertion], found: null
at 
org.apache.kafka.connect.transforms.util.Requirements.requireMap(Requirements.java:38)
at 
org.apache.kafka.connect.transforms.InsertField.applySchemaless(InsertField.java:138)
at 
org.apache.kafka.connect.transforms.InsertField.apply(InsertField.java:131)
at 
org.apache.kafka.connect.transforms.InsertFieldTest.tombstone(InsertFieldTest.java:128)
{code}

AFAICS, the transform can still be made to work in this case by simply 
building up a new value map from scratch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8503) AdminClient should ignore retries config if a custom timeout is provided

2019-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860642#comment-16860642
 ] 

ASF GitHub Bot commented on KAFKA-8503:
---

huxihx commented on pull request #6913: KAFKA-8503: Ignore retries config if a 
custom timeout is provided
URL: https://github.com/apache/kafka/pull/6913
 
 
   https://issues.apache.org/jira/browse/KAFKA-8503
   
   When a custom timeout is provided, the `retries` config can be ignored for 
individual APIs in KafkaAdminClient. Tweaked the code path for `Call.fail` to pass 
the NPathComplexity check.
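   For context, a minimal usage sketch of the two knobs involved; with this change the 
per-call timeout is intended to take precedence over `retries` (bootstrap server, topic 
name, and the 30s value are illustrative):

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.DescribeTopicsOptions;

public class AdminTimeoutExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(AdminClientConfig.RETRIES_CONFIG, 5); // the default today; can cut a call short

        try (AdminClient admin = AdminClient.create(props)) {
            // A custom per-call timeout: the user expectation is to wait up to 30s,
            // regardless of how many internal retries that takes.
            admin.describeTopics(Collections.singletonList("my-topic"),
                                 new DescribeTopicsOptions().timeoutMs(30_000))
                 .all()
                 .get();
        }
    }
}
```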
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> AdminClient should ignore retries config if a custom timeout is provided
> 
>
> Key: KAFKA-8503
> URL: https://issues.apache.org/jira/browse/KAFKA-8503
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Assignee: huxihx
>Priority: Major
>
> The admin client takes a `retries` config similar to the producer. The 
> default value is 5. Individual APIs also accept an optional timeout, which is 
> defaulted to `request.timeout.ms`. The call will fail if either `retries` or 
> the API timeout is exceeded. This is not very intuitive. I think a user would 
> expect to wait if they provided a timeout and the operation cannot be 
> completed. In general, timeouts are much easier for users to work with and 
> reason about.
> A couple options are either to ignore `retries` in this case or to increase 
> the default value of `retries` to something large and not likely to be 
> exceeded. I propose to do the first. Longer term, we could consider 
> deprecating `retries` and avoiding the overloading of `request.timeout.ms` by 
> providing a `default.api.timeout.ms` similar to the consumer.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KAFKA-8503) AdminClient should ignore retries config if a custom timeout is provided

2019-06-11 Thread huxihx (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huxihx reassigned KAFKA-8503:
-

Assignee: huxihx

> AdminClient should ignore retries config if a custom timeout is provided
> 
>
> Key: KAFKA-8503
> URL: https://issues.apache.org/jira/browse/KAFKA-8503
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Assignee: huxihx
>Priority: Major
>
> The admin client takes a `retries` config similar to the producer. The 
> default value is 5. Individual APIs also accept an optional timeout, which is 
> defaulted to `request.timeout.ms`. The call will fail if either `retries` or 
> the API timeout is exceeded. This is not very intuitive. I think a user would 
> expect to wait if they provided a timeout and the operation cannot be 
> completed. In general, timeouts are much easier for users to work with and 
> reason about.
> A couple options are either to ignore `retries` in this case or to increase 
> the default value of `retries` to something large and not likely to be 
> exceeded. I propose to do the first. Longer term, we could consider 
> deprecating `retries` and avoiding the overloading of `request.timeout.ms` by 
> providing a `default.api.timeout.ms` similar to the consumer.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8507) Support --bootstrap-server in all command line tools

2019-06-11 Thread Lee Dongjin (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860611#comment-16860611
 ] 

Lee Dongjin commented on KAFKA-8507:


I reviewed the tools in `bin` and found that the out-of-date tools are being fixed in 
the following PRs:

1. https://github.com/apache/kafka/pull/3453 (KAFKA-5532)
 * ProducerPerformance

2. https://github.com/apache/kafka/pull/2161 (KAFKA-4307)
 * VerifiableConsumer
 * VerifiableProducer

3. https://github.com/apache/kafka/pull/3605 (KAFKA-2111)
 * ConsumerGroupCommand
 * ReassignPartitionsCommand
 * ConsoleConsumer
 * StreamResetter

Are there any tools I omitted?

> Support --bootstrap-server in all command line tools
> 
>
> Key: KAFKA-8507
> URL: https://issues.apache.org/jira/browse/KAFKA-8507
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Reporter: Jason Gustafson
>Priority: Major
>  Labels: needs-kip
>
> This is an unambitious initial move toward standardizing the command line 
> tools. We have favored the name {{\-\-bootstrap-server}} in all new tools 
> since it matches the config {{bootstrap.servers}}, which is used by all 
> clients. Some older commands use {{\-\-broker-list}} or 
> {{\-\-bootstrap-servers}} and maybe other exotic variations. We should 
> support {{\-\-bootstrap-server}} in all commands and deprecate the other 
> options.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8193) Flaky Test MetricsIntegrationTest#testStreamMetricOfWindowStore

2019-06-11 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-8193:
---
Fix Version/s: (was: 2.4.0)

> Flaky Test MetricsIntegrationTest#testStreamMetricOfWindowStore
> ---
>
> Key: KAFKA-8193
> URL: https://issues.apache.org/jira/browse/KAFKA-8193
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Affects Versions: 2.3.0
>Reporter: Konstantine Karantasis
>Assignee: Guozhang Wang
>Priority: Major
>  Labels: flaky-test
>
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/3576/console]
>  *14:14:48* org.apache.kafka.streams.integration.MetricsIntegrationTest > 
> testStreamMetricOfWindowStore STARTED
> *14:14:59* 
> org.apache.kafka.streams.integration.MetricsIntegrationTest.testStreamMetricOfWindowStore
>  failed, log available in 
> /home/jenkins/jenkins-slave/workspace/kafka-pr-jdk11-scala2.12/streams/build/reports/testOutput/org.apache.kafka.streams.integration.MetricsIntegrationTest.testStreamMetricOfWindowStore.test.stdout
> *14:14:59* 
> *14:14:59* org.apache.kafka.streams.integration.MetricsIntegrationTest > 
> testStreamMetricOfWindowStore FAILED
> *14:14:59* java.lang.AssertionError: Condition not met within timeout 1. 
> testStoreMetricWindow -> Size of metrics of type:'put-latency-avg' must be 
> equal to:2 but it's equal to 0 expected:<2> but was:<0>
> *14:14:59* at 
> org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:361)
> *14:14:59* at 
> org.apache.kafka.streams.integration.MetricsIntegrationTest.testStreamMetricOfWindowStore(MetricsIntegrationTest.java:260)
> *14:15:01*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8263) Flaky Test MetricsIntegrationTest#testStreamMetricOfWindowStore

2019-06-11 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860599#comment-16860599
 ] 

Matthias J. Sax commented on KAFKA-8263:


[https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3714/tests]

> Flaky Test MetricsIntegrationTest#testStreamMetricOfWindowStore
> ---
>
> Key: KAFKA-8263
> URL: https://issues.apache.org/jira/browse/KAFKA-8263
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Affects Versions: 2.3.0
>Reporter: Matthias J. Sax
>Assignee: Bruno Cadonna
>Priority: Major
>  Labels: flaky-test
> Fix For: 2.4.0
>
>
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/3900/testReport/junit/org.apache.kafka.streams.integration/MetricsIntegrationTest/testStreamMetricOfWindowStore/]
> {quote}java.lang.AssertionError: Condition not met within timeout 1. 
> testStoreMetricWindow -> Size of metrics of type:'put-latency-avg' must be 
> equal to:2 but it's equal to 0 expected:<2> but was:<0> at 
> org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:361){quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8418) Connect System tests are not waiting for REST resources to be registered

2019-06-11 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-8418:
---
Fix Version/s: (was: 2.3)

> Connect System tests are not waiting for REST resources to be registered
> 
>
> Key: KAFKA-8418
> URL: https://issues.apache.org/jira/browse/KAFKA-8418
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.2.0
>Reporter: Oleksandr Diachenko
>Assignee: Oleksandr Diachenko
>Priority: Blocker
> Fix For: 1.0.3, 1.1.2, 2.0.2, 2.3.0, 2.1.2, 2.2.1
>
>
> I am getting an error while running Kafka tests:
> {code}
> Traceback (most recent call last): File 
> "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.7.5-py2.7.egg/ducktape/tests/runner_client.py",
>  line 132, in run data = self.run_test() File 
> "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.7.5-py2.7.egg/ducktape/tests/runner_client.py",
>  line 189, in run_test return self.test_context.function(self.test) File 
> "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/tests/kafkatest/tests/connect/connect_rest_test.py",
>  line 89, in test_rest_api assert set([connector_plugin['class'] for 
> connector_plugin in self.cc.list_connector_plugins()]) == 
> \{self.FILE_SOURCE_CONNECTOR, self.FILE_SINK_CONNECTOR} File 
> "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/tests/kafkatest/services/connect.py",
>  line 218, in list_connector_plugins return self._rest('/connector-plugins/', 
> node=node) File 
> "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/tests/kafkatest/services/connect.py",
>  line 234, in _rest raise ConnectRestError(resp.status_code, resp.text, 
> resp.url) ConnectRestError
> {code}
> From the logs, I see two messages:
> {code}
> [2019-05-23 16:09:59,373] INFO REST server listening at 
> http://172.31.39.205:8083/, advertising URL http://worker1:8083/ 
> (org.apache.kafka.connect.runtime.rest.RestServer)
> {code}
> and {code}
> [2019-05-23 16:10:00,738] INFO REST resources initialized; server is started 
> and ready to handle requests 
> (org.apache.kafka.connect.runtime.rest.RestServer)
> {code}
>  It takes 1365 ms to actually load the REST resources, but the test only waits for 
> the port to be listening. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8265) Connect Client Config Override policy

2019-06-11 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-8265:
---
Fix Version/s: (was: 2.3)
   2.3.0

> Connect Client Config Override policy
> -
>
> Key: KAFKA-8265
> URL: https://issues.apache.org/jira/browse/KAFKA-8265
> Project: Kafka
>  Issue Type: New Feature
>  Components: KafkaConnect
>Reporter: Magesh kumar Nandakumar
>Assignee: Magesh kumar Nandakumar
>Priority: Major
> Fix For: 2.3.0
>
>
> Right now, each source connector and sink connector inherit their client 
> configurations from the worker properties. Within the worker properties, all 
> configurations that have a prefix of "producer." or "consumer." are applied 
> to all source connectors and sink connectors respectively.
> We should allow the "producer." or "consumer." configurations to be overridden in 
> accordance with an override policy determined by the administrator.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8473) Adjust Connect system tests for incremental cooperative rebalancing and enable them for both eager and incremental cooperative rebalancing

2019-06-11 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-8473:
---
Affects Version/s: (was: 2.3)
   2.3.0

> Adjust Connect system tests for incremental cooperative rebalancing and 
> enable them for both eager and incremental cooperative rebalancing
> --
>
> Key: KAFKA-8473
> URL: https://issues.apache.org/jira/browse/KAFKA-8473
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.3.0
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
>Priority: Critical
> Fix For: 2.3.0
>
>
>  
> {{connect.protocol=compatible}}, which enables incremental cooperative 
> rebalancing if all workers are on that version, is now the default option. 
> System tests should be parameterized to keep running for the eager 
> rebalancing protocol as well, to make sure no regressions have happened. 
> Also, for the incremental cooperative protocol, a few tests need to be 
> adjusted to have a lower rebalancing delay (the delay that is applied to wait 
> for returning workers) 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8475) Temporarily restore SslFactory.sslContext() helper

2019-06-11 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-8475:
---
Fix Version/s: (was: 2.3)
   2.3.0

> Temporarily restore SslFactory.sslContext() helper
> --
>
> Key: KAFKA-8475
> URL: https://issues.apache.org/jira/browse/KAFKA-8475
> Project: Kafka
>  Issue Type: Bug
>Reporter: Colin P. McCabe
>Assignee: Randall Hauch
>Priority: Blocker
> Fix For: 2.3.0
>
>
> Temporarily restore the SslFactory.sslContext() function, which some 
> connectors use.  This function is not a public API and it will be removed 
> eventually.  For now, we will mark it as deprecated.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8473) Adjust Connect system tests for incremental cooperative rebalancing and enable them for both eager and incremental cooperative rebalancing

2019-06-11 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-8473:
---
Fix Version/s: (was: 2.3)
   2.3.0

> Adjust Connect system tests for incremental cooperative rebalancing and 
> enable them for both eager and incremental cooperative rebalancing
> --
>
> Key: KAFKA-8473
> URL: https://issues.apache.org/jira/browse/KAFKA-8473
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.3
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
>Priority: Critical
> Fix For: 2.3.0
>
>
>  
> {{connect.protocol=compatible}}, which enables incremental cooperative 
> rebalancing if all workers are on that version, is now the default option. 
> System tests should be parameterized to keep running for the eager 
> rebalancing protocol as well, to make sure no regressions have happened. 
> Also, for the incremental cooperative protocol, a few tests need to be 
> adjusted to have a lower rebalancing delay (the delay that is applied to wait 
> for returning workers) 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7988) Flaky Test DynamicBrokerReconfigurationTest#testThreadPoolResize

2019-06-11 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860597#comment-16860597
 ] 

Matthias J. Sax commented on KAFKA-7988:


Failed in 2.3: 
[https://builds.apache.org/blue/organizations/jenkins/kafka-2.3-jdk8/detail/kafka-2.3-jdk8/47/tests]
{code:java}
java.lang.AssertionError: expected:<{0=10, 1=11, 2=12, 3=13, 4=4, 5=5, 6=6, 
7=7, 8=8, 9=9}> but was:<{0=10, 1=11, 2=12, 3=13, 4=14, 5=5, 6=6, 7=7, 8=8, 
9=9}>
at org.junit.Assert.fail(Assert.java:89)
at org.junit.Assert.failNotEquals(Assert.java:835)
at org.junit.Assert.assertEquals(Assert.java:120)
at org.junit.Assert.assertEquals(Assert.java:146)
at 
kafka.server.DynamicBrokerReconfigurationTest.stopAndVerifyProduceConsume(DynamicBrokerReconfigurationTest.scala:1353)
at 
kafka.server.DynamicBrokerReconfigurationTest.verifyThreadPoolResize$1(DynamicBrokerReconfigurationTest.scala:615)
at 
kafka.server.DynamicBrokerReconfigurationTest.testThreadPoolResize(DynamicBrokerReconfigurationTest.scala:629)
{code}

> Flaky Test DynamicBrokerReconfigurationTest#testThreadPoolResize
> 
>
> Key: KAFKA-7988
> URL: https://issues.apache.org/jira/browse/KAFKA-7988
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.2.0
>Reporter: Matthias J. Sax
>Assignee: Rajini Sivaram
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.4.0
>
>
> To get stable nightly builds for `2.2` release, I create tickets for all 
> observed test failures.
> [https://builds.apache.org/blue/organizations/jenkins/kafka-2.2-jdk8/detail/kafka-2.2-jdk8/30/]
> {quote}kafka.server.DynamicBrokerReconfigurationTest > testThreadPoolResize 
> FAILED java.lang.AssertionError: Invalid threads: expected 6, got 5: 
> List(ReplicaFetcherThread-0-0, ReplicaFetcherThread-0-1, 
> ReplicaFetcherThread-0-0, ReplicaFetcherThread-0-2, ReplicaFetcherThread-0-1) 
> at org.junit.Assert.fail(Assert.java:88) at 
> org.junit.Assert.assertTrue(Assert.java:41) at 
> kafka.server.DynamicBrokerReconfigurationTest.verifyThreads(DynamicBrokerReconfigurationTest.scala:1260)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.maybeVerifyThreadPoolSize$1(DynamicBrokerReconfigurationTest.scala:531)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.resizeThreadPool$1(DynamicBrokerReconfigurationTest.scala:550)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.reducePoolSize$1(DynamicBrokerReconfigurationTest.scala:536)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.$anonfun$testThreadPoolResize$3(DynamicBrokerReconfigurationTest.scala:559)
>  at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) at 
> kafka.server.DynamicBrokerReconfigurationTest.verifyThreadPoolResize$1(DynamicBrokerReconfigurationTest.scala:558)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.testThreadPoolResize(DynamicBrokerReconfigurationTest.scala:572){quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7988) Flaky Test DynamicBrokerReconfigurationTest#testThreadPoolResize

2019-06-11 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-7988:
---
Affects Version/s: 2.3.0

> Flaky Test DynamicBrokerReconfigurationTest#testThreadPoolResize
> 
>
> Key: KAFKA-7988
> URL: https://issues.apache.org/jira/browse/KAFKA-7988
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Matthias J. Sax
>Assignee: Rajini Sivaram
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.4.0
>
>
> To get stable nightly builds for `2.2` release, I create tickets for all 
> observed test failures.
> [https://builds.apache.org/blue/organizations/jenkins/kafka-2.2-jdk8/detail/kafka-2.2-jdk8/30/]
> {quote}kafka.server.DynamicBrokerReconfigurationTest > testThreadPoolResize 
> FAILED java.lang.AssertionError: Invalid threads: expected 6, got 5: 
> List(ReplicaFetcherThread-0-0, ReplicaFetcherThread-0-1, 
> ReplicaFetcherThread-0-0, ReplicaFetcherThread-0-2, ReplicaFetcherThread-0-1) 
> at org.junit.Assert.fail(Assert.java:88) at 
> org.junit.Assert.assertTrue(Assert.java:41) at 
> kafka.server.DynamicBrokerReconfigurationTest.verifyThreads(DynamicBrokerReconfigurationTest.scala:1260)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.maybeVerifyThreadPoolSize$1(DynamicBrokerReconfigurationTest.scala:531)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.resizeThreadPool$1(DynamicBrokerReconfigurationTest.scala:550)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.reducePoolSize$1(DynamicBrokerReconfigurationTest.scala:536)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.$anonfun$testThreadPoolResize$3(DynamicBrokerReconfigurationTest.scala:559)
>  at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) at 
> kafka.server.DynamicBrokerReconfigurationTest.verifyThreadPoolResize$1(DynamicBrokerReconfigurationTest.scala:558)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.testThreadPoolResize(DynamicBrokerReconfigurationTest.scala:572){quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8078) Flaky Test TableTableJoinIntegrationTest#testInnerInner

2019-06-11 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860596#comment-16860596
 ] 

Matthias J. Sax commented on KAFKA-8078:


Failed again with timeout: 
[https://builds.apache.org/blue/organizations/jenkins/kafka-2.3-jdk8/detail/kafka-2.3-jdk8/47/tests]

testLeftOuter, caching enabled

> Flaky Test TableTableJoinIntegrationTest#testInnerInner
> ---
>
> Key: KAFKA-8078
> URL: https://issues.apache.org/jira/browse/KAFKA-8078
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Affects Versions: 2.3.0
>Reporter: Matthias J. Sax
>Priority: Major
>  Labels: flaky-test
> Fix For: 2.4.0
>
>
> [https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3445/tests]
> {quote}java.lang.AssertionError: Condition not met within timeout 15000. 
> Never received expected final result.
> at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:365)
> at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:325)
> at 
> org.apache.kafka.streams.integration.AbstractJoinIntegrationTest.runTest(AbstractJoinIntegrationTest.java:246)
> at 
> org.apache.kafka.streams.integration.TableTableJoinIntegrationTest.testInnerInner(TableTableJoinIntegrationTest.java:196){quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8041) Flaky Test LogDirFailureTest#testIOExceptionDuringLogRoll

2019-06-11 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860594#comment-16860594
 ] 

Matthias J. Sax commented on KAFKA-8041:


Failed again: 
[https://builds.apache.org/blue/organizations/jenkins/kafka-2.0-jdk8/detail/kafka-2.0-jdk8/275/tests]

> Flaky Test LogDirFailureTest#testIOExceptionDuringLogRoll
> -
>
> Key: KAFKA-8041
> URL: https://issues.apache.org/jira/browse/KAFKA-8041
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.0.1, 2.3.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.4.0
>
>
> [https://builds.apache.org/blue/organizations/jenkins/kafka-2.0-jdk8/detail/kafka-2.0-jdk8/236/tests]
> {quote}java.lang.AssertionError: Expected some messages
> at kafka.utils.TestUtils$.fail(TestUtils.scala:357)
> at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:787)
> at 
> kafka.server.LogDirFailureTest.testProduceAfterLogDirFailureOnLeader(LogDirFailureTest.scala:189)
> at 
> kafka.server.LogDirFailureTest.testIOExceptionDuringLogRoll(LogDirFailureTest.scala:63){quote}
> STDOUT
> {quote}[2019-03-05 03:44:58,614] ERROR [ReplicaFetcher replicaId=1, 
> leaderId=0, fetcherId=0] Error for partition topic-6 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-05 03:44:58,614] ERROR [ReplicaFetcher replicaId=1, leaderId=0, 
> fetcherId=0] Error for partition topic-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-05 03:44:58,615] ERROR [ReplicaFetcher replicaId=1, leaderId=0, 
> fetcherId=0] Error for partition topic-10 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-05 03:44:58,615] ERROR [ReplicaFetcher replicaId=1, leaderId=0, 
> fetcherId=0] Error for partition topic-4 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-05 03:44:58,615] ERROR [ReplicaFetcher replicaId=1, leaderId=0, 
> fetcherId=0] Error for partition topic-8 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-05 03:44:58,615] ERROR [ReplicaFetcher replicaId=1, leaderId=0, 
> fetcherId=0] Error for partition topic-2 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-05 03:45:00,248] ERROR Error while rolling log segment for topic-0 
> in dir 
> /home/jenkins/jenkins-slave/workspace/kafka-2.0-jdk8/core/data/kafka-3869208920357262216
>  (kafka.server.LogDirFailureChannel:76)
> java.io.FileNotFoundException: 
> /home/jenkins/jenkins-slave/workspace/kafka-2.0-jdk8/core/data/kafka-3869208920357262216/topic-0/.index
>  (Not a directory)
> at java.io.RandomAccessFile.open0(Native Method)
> at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
> at java.io.RandomAccessFile.(RandomAccessFile.java:243)
> at kafka.log.AbstractIndex.$anonfun$resize$1(AbstractIndex.scala:121)
> at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:12)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:251)
> at kafka.log.AbstractIndex.resize(AbstractIndex.scala:115)
> at kafka.log.AbstractIndex.$anonfun$trimToValidSize$1(AbstractIndex.scala:184)
> at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:12)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:251)
> at kafka.log.AbstractIndex.trimToValidSize(AbstractIndex.scala:184)
> at kafka.log.LogSegment.onBecomeInactiveSegment(LogSegment.scala:501)
> at kafka.log.Log.$anonfun$roll$8(Log.scala:1520)
> at kafka.log.Log.$anonfun$roll$8$adapted(Log.scala:1520)
> at scala.Option.foreach(Option.scala:257)
> at kafka.log.Log.$anonfun$roll$2(Log.scala:1520)
> at kafka.log.Log.maybeHandleIOException(Log.scala:1881)
> at kafka.log.Log.roll(Log.scala:1484)
> at 
> kafka.server.LogDirFailureTest.testProduceAfterLogDirFailureOnLeader(LogDirFailureTest.scala:154)
> at 
> kafka.server.LogDirFailureTest.testIOExceptionDuringLogRoll(LogDirFailureTest.scala:63)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> 

[jira] [Comment Edited] (KAFKA-8516) Consider allowing all replicas to have read/write permissions

2019-06-11 Thread Richard Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860591#comment-16860591
 ] 

Richard Yu edited comment on KAFKA-8516 at 6/11/19 6:16 AM:


Well, this is when we start straying into an area called "consensus 
algorithms". Kafka's current leader-replica model loosely follows an algorithm 
referred to as Raft (research paper here: [https://raft.github.io/raft.pdf] ). 
If we wish to implement the write permissions part (which looks like a pretty 
big if), then we would perhaps have to consider something along the lines of 
EPaxos ( [https://www.cs.cmu.edu/~dga/papers/epaxos-sosp2013.pdf] ).

cc [~hachikuji] your thoughts on this?

Edit: If implemented correctly, there should be a pretty good performance gain 
(results in RAFT paper anyways seem to indicate this).


was (Author: yohan123):
Well, this is when we start straying into an area called "consensus 
algorithms". Kafka's current leader-replica model closely follows an algorithm 
referred to as Raft (research paper here: [https://raft.github.io/raft.pdf] ). 
If we wish to implement the write permissions part (which looks like a pretty 
big if), then we would perhaps have to consider something along the lines of 
EPaxos ( [https://www.cs.cmu.edu/~dga/papers/epaxos-sosp2013.pdf] ).

cc [~hachikuji] your thoughts on this?

Edit: If implemented correctly, there should be a pretty good performance gain 
(results in RAFT paper anyways seem to indicate this).

> Consider allowing all replicas to have read/write permissions
> -
>
> Key: KAFKA-8516
> URL: https://issues.apache.org/jira/browse/KAFKA-8516
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Richard Yu
>Priority: Major
>
> Currently, in Kafka internals, a leader is responsible for all the read and 
> write operations requested by the user. This naturally incurs a bottleneck 
> since one replica, as the leader, would experience a significantly heavier 
> workload than other replicas and also means that all client commands must 
> pass through a chokepoint. If a leader fails, all processing effectively 
> comes to a halt until another leader election. In order to help solve this 
> problem, we could think about redesigning Kafka core so that any replica is 
> able to do read and write operations as well. That is, the system would be changed 
> so that _all_ replicas have read/write permissions.
>   
>  This has multiple positives. Notably the following:
>  * Workload can be more evenly distributed since leader replicas are weighted 
> more than follower replicas (in this new design, all replicas are equal)
>  * Some failures would not be as catastrophic as in the leader-follower 
> paradigm. There is no one single "leader". If one replica goes down, others 
> are still able to read/write as needed. Processing could continue without 
> interruption.
> The implementation of a change like this would be very extensive, and 
> discussion would be needed to decide if such an improvement as described 
> above would warrant such a drastic redesign of Kafka internals.
> Relevant KIP for read permissions can be found here:
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KAFKA-8516) Consider allowing all replicas to have read/write permissions

2019-06-11 Thread Richard Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860591#comment-16860591
 ] 

Richard Yu edited comment on KAFKA-8516 at 6/11/19 6:15 AM:


Well, this is when we start straying into an area called "consensus 
algorithms". Kafka's current leader-replica model closely follows an algorithm 
referred to as Raft (research paper here: [https://raft.github.io/raft.pdf] ). 
If we wish to implement the write permissions part (which looks like a pretty 
big if), then we would perhaps have to consider something along the lines of 
EPaxos ( [https://www.cs.cmu.edu/~dga/papers/epaxos-sosp2013.pdf] ).

cc [~hachikuji] your thoughts on this?

Edit: If implemented correctly, there should be a pretty good performance gain 
(results in RAFT paper anyways seem to indicate this).


was (Author: yohan123):
Well, this is when we start straying into an area called "consensus 
algorithms". Kafka's current leader-replica model closely follows an algorithm 
referred to as Raft (research paper here: [https://raft.github.io/raft.pdf] ). 
If we wish to implement the write permissions part (which looks like a pretty 
big if), then we would perhaps have to consider something along the lines of 
EPaxos ( [https://www.cs.cmu.edu/~dga/papers/epaxos-sosp2013.pdf] ). 

cc [~hachikuji] your thoughts on this?

> Consider allowing all replicas to have read/write permissions
> -
>
> Key: KAFKA-8516
> URL: https://issues.apache.org/jira/browse/KAFKA-8516
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Richard Yu
>Priority: Major
>
> Currently, in Kafka internals, a leader is responsible for all the read and 
> write operations requested by the user. This naturally incurs a bottleneck 
> since one replica, as the leader, would experience a significantly heavier 
> workload than other replicas and also means that all client commands must 
> pass through a chokepoint. If a leader fails, all processing effectively 
> comes to a halt until another leader election. In order to help solve this 
> problem, we could think about redesigning Kafka core so that any replica is 
> able to do read and write operations as well. That is, the system would be changed 
> so that _all_ replicas have read/write permissions.
>   
>  This has multiple positives. Notably the following:
>  * Workload can be more evenly distributed since leader replicas are weighted 
> more than follower replicas (in this new design, all replicas are equal)
>  * Some failures would not be as catastrophic as in the leader-follower 
> paradigm. There is no one single "leader". If one replica goes down, others 
> are still able to read/write as needed. Processing could continue without 
> interruption.
> The implementation of a change like this would be very extensive, and 
> discussion would be needed to decide if such an improvement as described 
> above would warrant such a drastic redesign of Kafka internals.
> Relevant KIP for read permissions can be found here:
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8516) Consider allowing all replicas to have read/write permissions

2019-06-11 Thread Richard Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860591#comment-16860591
 ] 

Richard Yu commented on KAFKA-8516:
---

Well, this is when we start straying into an area called "consensus 
algorithms". Kafka's current leader-replica model closely follows an algorithm 
referred to as Raft (research paper here: [https://raft.github.io/raft.pdf] ). 
If we wish to implement the write permissions part (which looks like a pretty 
big if), then we would perhaps have to consider something along the lines of 
EPaxos ( [https://www.cs.cmu.edu/~dga/papers/epaxos-sosp2013.pdf] ). 

cc [~hachikuji] your thoughts on this?

> Consider allowing all replicas to have read/write permissions
> -
>
> Key: KAFKA-8516
> URL: https://issues.apache.org/jira/browse/KAFKA-8516
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Richard Yu
>Priority: Major
>
> Currently, in Kafka internals, a leader is responsible for all the read and 
> write operations requested by the user. This naturally incurs a bottleneck 
> since one replica, as the leader, would experience a significantly heavier 
> workload than other replicas and also means that all client commands must 
> pass through a chokepoint. If a leader fails, all processing effectively 
> comes to a halt until another leader election. In order to help solve this 
> problem, we could think about redesigning Kafka core so that any replica is 
> able to do read and write operations as well. That is, the system would be changed 
> so that _all_ replicas have read/write permissions.
>   
>  This has multiple positives. Notably the following:
>  * Workload can be more evenly distributed since leader replicas are weighted 
> more than follower replicas (in this new design, all replicas are equal)
>  * Some failures would not be as catastrophic as in the leader-follower 
> paradigm. There is no one single "leader". If one replica goes down, others 
> are still able to read/write as needed. Processing could continue without 
> interruption.
> The implementation of a change like this would be very extensive, and 
> discussion would be needed to decide if such an improvement as described 
> above would warrant such a drastic redesign of Kafka internals.
> Relevant KIP for read permissions can be found here:
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8450) Augment processed in MockProcessor as KeyValueAndTimestamp

2019-06-11 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860589#comment-16860589
 ] 

Matthias J. Sax commented on KAFKA-8450:


Yes. Instead of calling `makeRecord`, which returns a String, the idea is to use 
`KeyValueTimestamp`.
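As a rough illustration of the suggested change (class and method names below are made 
up for the sketch, not the actual MockProcessor API), the book-keeping list would hold 
typed entries instead of Strings:

{code:java}
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names are made up and not the actual MockProcessor API.
class KeyValueAndTimestamp<K, V> {
    final K key;
    final V value;
    final long timestamp;

    KeyValueAndTimestamp(K key, V value, long timestamp) {
        this.key = key;
        this.value = value;
        this.timestamp = timestamp;
    }
}

public class MockProcessorSketch<K, V> {
    // Before: List<String> processed, built via toString().
    // After: typed entries, so tests can assert on key, value and timestamp directly.
    public final List<KeyValueAndTimestamp<K, V>> processed = new ArrayList<>();

    public void process(K key, V value, long timestamp) {
        processed.add(new KeyValueAndTimestamp<>(key, value, timestamp));
    }
}
{code}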

> Augment processed in MockProcessor as KeyValueAndTimestamp
> --
>
> Key: KAFKA-8450
> URL: https://issues.apache.org/jira/browse/KAFKA-8450
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, unit tests
>Reporter: Guozhang Wang
>Assignee: SuryaTeja Duggi
>Priority: Major
>  Labels: newbie
>
> Today the book-keeping list of `processed` records in MockProcessor is in the 
> form of Strings: we just call the key / value types' toString 
> functions in order to book-keep. This loses the type information and does not 
> keep a timestamp associated with each record.
> It would be better to change its type to `KeyValueAndTimestamp` and refactor the 
> impacted unit tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8513) Add kafka-streams-application-reset.bat for Windows platform

2019-06-11 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860588#comment-16860588
 ] 

Matthias J. Sax commented on KAFKA-8513:


Ack. Fine with me. The PR looks good in general.

> Add kafka-streams-application-reset.bat for Windows platform
> 
>
> Key: KAFKA-8513
> URL: https://issues.apache.org/jira/browse/KAFKA-8513
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, tools
>Reporter: Kengo Seki
>Assignee: Kengo Seki
>Priority: Minor
>
> For improving Windows support, it'd be nice if there were a batch file 
> corresponding to bin/kafka-streams-application-reset.sh.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8516) Consider allowing all replicas to have read/write permissions

2019-06-11 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860587#comment-16860587
 ] 

Matthias J. Sax commented on KAFKA-8516:


Agreed.

In fact, I am not even sure if we can (or want to) allow writing to different 
replicas at all. Solving the consistency problem is very, very(!) hard, and 
might not be possible without a major performance hit. Hence, I tend to think 
that it will never be implemented.

> Consider allowing all replicas to have read/write permissions
> -
>
> Key: KAFKA-8516
> URL: https://issues.apache.org/jira/browse/KAFKA-8516
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Richard Yu
>Priority: Major
>
> Currently, in Kafka internals, a leader is responsible for all the read and 
> write operations requested by the user. This naturally incurs a bottleneck 
> since one replica, as the leader, would experience a significantly heavier 
> workload than other replicas and also means that all client commands must 
> pass through a chokepoint. If a leader fails, all processing effectively 
> comes to a halt until another leader election. In order to help solve this 
> problem, we could think about redesigning Kafka core so that any replica is 
> able to do read and write operations as well. That is, the system would be changed 
> so that _all_ replicas have read/write permissions.
>   
>  This has multiple positives. Notably the following:
>  * Workload can be more evenly distributed since leader replicas are weighted 
> more than follower replicas (in this new design, all replicas are equal)
>  * Some failures would not be as catastrophic as in the leader-follower 
> paradigm. There is no one single "leader". If one replica goes down, others 
> are still able to read/write as needed. Processing could continue without 
> interruption.
> The implementation of a change like this would be very extensive, and 
> discussion would be needed to decide if such an improvement as described 
> above would warrant such a drastic redesign of Kafka internals.
> Relevant KIP for read permissions can be found here:
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)