[jira] [Assigned] (KAFKA-7487) DumpLogSegments reports mismatches for indexed offsets which are not at the start of a record batch

2018-10-05 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma reassigned KAFKA-7487:
--

Assignee: Ismael Juma

> DumpLogSegments reports mismatches for indexed offsets which are not at the 
> start of a record batch
> ---
>
> Key: KAFKA-7487
> URL: https://issues.apache.org/jira/browse/KAFKA-7487
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 2.0.0
>Reporter: Michael Bingham
>Assignee: Ismael Juma
>Priority: Minor
>  Labels: newbie
>
> When running {{DumpLogSegments}} against an {{.index}} file, mismatches may 
> be reported when the indexed message offset is not the first record in a 
> batch. For example:
> {code}
>  Mismatches in 
> :/var/lib/kafka/data/replicated-topic-0/.index
>  Index offset: 968, log offset: 966
> {code}
> And looking at the corresponding {{.log}} file:
> {code}
> baseOffset: 966 lastOffset: 968 count: 3 baseSequence: -1 lastSequence: -1 
> producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 0 isTransactional: 
> false position: 3952771 CreateTime: 1538768639065 isvalid: true size: 12166 
> magic: 2 compresscodec: NONE crc: 294402254 
> {code}
> In this case, the last offset in the batch was indexed instead of the first, 
> but the index has to map physical position to the start of the batch, leading 
> to the mismatch.
> It seems like {{DumpLogSegments}} should not report these cases as 
> mismatches, which users might interpret as an error or problem.
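To make the suggestion concrete, here is a minimal sketch (plain Java, not the actual DumpLogSegments code; the method names are made up) of the relaxed check: an index entry should only count as a mismatch when its offset falls outside the batch found at the indexed physical position.

{code:java}
final class IndexCheckSketch {
    // Current behaviour (simplified): any indexed offset that is not the batch's
    // base offset is reported as a mismatch, even though the offset index simply
    // maps an offset to the physical position of the whole batch.
    static boolean reportedAsMismatch(long indexedOffset, long batchBaseOffset) {
        return indexedOffset != batchBaseOffset;
    }

    // Suggested behaviour: only report a mismatch when the indexed offset does not
    // fall inside the batch located at the indexed position.
    static boolean realMismatch(long indexedOffset, long batchBaseOffset, long batchLastOffset) {
        return indexedOffset < batchBaseOffset || indexedOffset > batchLastOffset;
    }

    public static void main(String[] args) {
        // The example from the report: index offset 968 points at the batch [966..968].
        System.out.println(reportedAsMismatch(968L, 966L));  // true  -> reported today
        System.out.println(realMismatch(968L, 966L, 968L));  // false -> not actually a problem
    }
}
{code}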



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7415) OffsetsForLeaderEpoch may incorrectly respond with undefined epoch causing truncation to HW

2018-10-05 Thread Jason Gustafson (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson updated KAFKA-7415:
---
Fix Version/s: 1.1.2

> OffsetsForLeaderEpoch may incorrectly respond with undefined epoch causing 
> truncation to HW
> ---
>
> Key: KAFKA-7415
> URL: https://issues.apache.org/jira/browse/KAFKA-7415
> Project: Kafka
>  Issue Type: Bug
>  Components: replication
>Affects Versions: 2.0.0
>Reporter: Anna Povzner
>Assignee: Jason Gustafson
>Priority: Major
> Fix For: 1.1.2, 2.0.1, 2.1.0
>
>
> If the follower's last appended epoch is ahead of the leader's last appended 
> epoch, the OffsetsForLeaderEpoch response will incorrectly send 
> (UNDEFINED_EPOCH, UNDEFINED_EPOCH_OFFSET), and the follower will truncate to 
> HW. This may lead to data loss in some rare cases where 2 back-to-back leader 
> elections happen (failure of one leader, followed by quick re-election of the 
> next leader due to preferred leader election, so that all replicas are still 
> in the ISR, and then failure of the 3rd leader).
> The bug is in LeaderEpochFileCache.endOffsetFor(), which returns 
> (UNDEFINED_EPOCH, UNDEFINED_EPOCH_OFFSET) if the requested leader epoch is 
> ahead of the last leader epoch in the cache. The method should return (last 
> leader epoch in the cache, LEO) in this scenario.
> We don't create an entry in the leader epoch cache until a message is appended 
> with the new leader epoch; every append to the log calls 
> LeaderEpochFileCache.assign(). However, it would be much cleaner if 
> `makeLeader` created an entry in the cache as soon as the replica becomes a 
> leader, which would fix the bug. For the case where the leader never appends any 
> messages and the next leader epoch starts with the same offset, we already 
> have clearAndFlushLatest(), which clears entries with start offsets greater than 
> or equal to the passed offset. LeaderEpochFileCache.assign() could be merged 
> with clearAndFlushLatest(), so that we clear cache entries with offsets greater 
> than or equal to the start offset of the new epoch and do not need to call 
> these methods separately.
>  
> Here is an example of a scenario where the issue leads to the data loss.
> Suppose we have three replicas: r1, r2, and r3. Initially, the ISR consists 
> of (r1, r2, r3) and the leader is r1. The data up to offset 10 has been 
> committed to the ISR. Here is the initial state:
> {code:java}
> Leader: r1
> leader epoch: 0
> ISR(r1, r2, r3)
> r1: [hw=10, leo=10]
> r2: [hw=8, leo=10]
> r3: [hw=5, leo=10]
> {code}
> Replica 1 fails and leaves the ISR, which makes Replica 2 the new leader with 
> leader epoch = 1. The leader appends a batch, but it is not replicated yet to 
> the followers.
> {code:java}
> Leader: r2
> leader epoch: 1
> ISR(r2, r3)
> r1: [hw=10, leo=10]
> r2: [hw=8, leo=11]
> r3: [hw=5, leo=10]
> {code}
> Replica 3 is elected a leader (due to preferred leader election) before it 
> has a chance to truncate, with leader epoch 2. 
> {code:java}
> Leader: r3
> leader epoch: 2
> ISR(r2, r3)
> r1: [hw=10, leo=10]
> r2: [hw=8, leo=11]
> r3: [hw=5, leo=10]
> {code}
> Replica 2 sends OffsetsForLeaderEpoch(leader epoch = 1) to Replica 3. Replica 
> 3 incorrectly replies with UNDEFINED_EPOCH_OFFSET, and Replica 2 truncates to 
> HW. If Replica 3 fails before Replica 2 re-fetches the data, this may lead to 
> data loss.
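To make the intended fix concrete, here is a minimal sketch (Java, with a plain sorted map standing in for the Scala LeaderEpochFileCache; names and return shape are illustrative) of the corrected endOffsetFor behaviour: a request for an epoch newer than anything in the cache is answered with the latest cached epoch and the log end offset rather than the undefined sentinels.

{code:java}
import java.util.TreeMap;

class EpochCacheSketch {
    static final int UNDEFINED_EPOCH = -1;
    static final long UNDEFINED_EPOCH_OFFSET = -1L;

    // epoch -> start offset of that epoch
    private final TreeMap<Integer, Long> epochToStartOffset = new TreeMap<>();

    void assign(int epoch, long startOffset) {
        epochToStartOffset.putIfAbsent(epoch, startOffset);
    }

    // Returns {epoch, end offset} for the requested leader epoch.
    long[] endOffsetFor(int requestedEpoch, long logEndOffset) {
        if (epochToStartOffset.isEmpty())
            return new long[]{UNDEFINED_EPOCH, UNDEFINED_EPOCH_OFFSET};

        int latestEpoch = epochToStartOffset.lastKey();
        if (requestedEpoch > latestEpoch)
            // The fix described above: a request for an epoch newer than anything we
            // have seen is answered with our latest epoch and the LEO, so the follower
            // truncates correctly instead of falling back to its high watermark.
            return new long[]{latestEpoch, logEndOffset};

        // End offset of the requested epoch = start offset of the next higher epoch,
        // or the LEO if the requested epoch is the latest one we know about.
        Integer nextEpoch = epochToStartOffset.higherKey(requestedEpoch);
        long endOffset = (nextEpoch == null) ? logEndOffset : epochToStartOffset.get(nextEpoch);
        return new long[]{requestedEpoch, endOffset};
    }
}
{code}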



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7415) OffsetsForLeaderEpoch may incorrectly respond with undefined epoch causing truncation to HW

2018-10-05 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640391#comment-16640391
 ] 

ASF GitHub Bot commented on KAFKA-7415:
---

hachikuji closed pull request #5749: KAFKA-7415; Persist leader epoch and start 
offset on becoming a leader
URL: https://github.com/apache/kafka/pull/5749
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/core/src/main/scala/kafka/cluster/Partition.scala b/core/src/main/scala/kafka/cluster/Partition.scala
index e3a8186094f..ac0de9dba1d 100755
--- a/core/src/main/scala/kafka/cluster/Partition.scala
+++ b/core/src/main/scala/kafka/cluster/Partition.scala
@@ -280,8 +280,17 @@ class Partition(val topic: String,
       leaderEpoch = partitionStateInfo.basePartitionState.leaderEpoch
       leaderEpochStartOffsetOpt = Some(leaderEpochStartOffset)
       zkVersion = partitionStateInfo.basePartitionState.zkVersion
-      val isNewLeader = leaderReplicaIdOpt.map(_ != localBrokerId).getOrElse(true)
 
+      // In the case of successive leader elections in a short time period, a follower may have
+      // entries in its log from a later epoch than any entry in the new leader's log. In order
+      // to ensure that these followers can truncate to the right offset, we must cache the new
+      // leader epoch and the start offset since it should be larger than any epoch that a follower
+      // would try to query.
+      leaderReplica.epochs.foreach { epochCache =>
+        epochCache.assign(leaderEpoch, leaderEpochStartOffset)
+      }
+
+      val isNewLeader = !leaderReplicaIdOpt.contains(localBrokerId)
       val curLeaderLogEndOffset = leaderReplica.logEndOffset.messageOffset
       val curTimeMs = time.milliseconds
       // initialize lastCaughtUpTime of replicas as well as their lastFetchTimeMs and lastFetchLeaderLogEndOffset.
diff --git a/core/src/main/scala/kafka/cluster/Replica.scala b/core/src/main/scala/kafka/cluster/Replica.scala
index 4b65e439e2c..462f1f3cc23 100644
--- a/core/src/main/scala/kafka/cluster/Replica.scala
+++ b/core/src/main/scala/kafka/cluster/Replica.scala
@@ -18,6 +18,7 @@
 package kafka.cluster
 
 import kafka.log.Log
+import kafka.server.epoch.LeaderEpochFileCache
 import kafka.utils.Logging
 import kafka.server.{LogOffsetMetadata, LogReadResult}
 import kafka.common.KafkaException
@@ -55,7 +56,7 @@ class Replica(val brokerId: Int,
 
   def lastCaughtUpTimeMs = _lastCaughtUpTimeMs
 
-  val epochs = log.map(_.leaderEpochCache)
+  val epochs: Option[LeaderEpochFileCache] = log.map(_.leaderEpochCache)
 
   info(s"Replica loaded for partition $topicPartition with initial high watermark $initialHighWatermarkValue")
   log.foreach(_.onHighWatermarkIncremented(initialHighWatermarkValue))
diff --git a/core/src/main/scala/kafka/log/Log.scala b/core/src/main/scala/kafka/log/Log.scala
index 9b423ba5933..eeb569a0771 100644
--- a/core/src/main/scala/kafka/log/Log.scala
+++ b/core/src/main/scala/kafka/log/Log.scala
@@ -39,7 +39,7 @@ import com.yammer.metrics.core.Gauge
 import org.apache.kafka.common.utils.{Time, Utils}
 import kafka.message.{BrokerCompressionCodec, CompressionCodec, NoCompressionCodec}
 import kafka.server.checkpoints.{LeaderEpochCheckpointFile, LeaderEpochFile}
-import kafka.server.epoch.{LeaderEpochCache, LeaderEpochFileCache}
+import kafka.server.epoch.LeaderEpochFileCache
 import org.apache.kafka.common.TopicPartition
 import org.apache.kafka.common.requests.FetchResponse.AbortedTransaction
 import java.util.Map.{Entry => JEntry}
@@ -208,7 +208,7 @@ class Log(@volatile var dir: File,
   /* the actual segments of the log */
   private val segments: ConcurrentNavigableMap[java.lang.Long, LogSegment] = new ConcurrentSkipListMap[java.lang.Long, LogSegment]
 
-  @volatile private var _leaderEpochCache: LeaderEpochCache = initializeLeaderEpochCache()
+  @volatile private var _leaderEpochCache: LeaderEpochFileCache = initializeLeaderEpochCache()
 
   locally {
     val startMs = time.milliseconds
@@ -218,12 +218,12 @@ class Log(@volatile var dir: File,
     /* Calculate the offset of the next message */
     nextOffsetMetadata = new LogOffsetMetadata(nextOffset, activeSegment.baseOffset, activeSegment.size)
 
-    _leaderEpochCache.clearAndFlushLatest(nextOffsetMetadata.messageOffset)
+    _leaderEpochCache.truncateFromEnd(nextOffsetMetadata.messageOffset)
 
     logStartOffset = math.max(logStartOffset, segments.firstEntry.getValue.baseOffset)
 
     // The earliest leader epoch may not be flushed during a hard failure. Recover it here.
-    _leaderEpochCache.clearAndFlushEarliest(logStartOffset)
+    _leaderEpochCache.truncateFromStart(logStartOffset)
 
     loadProducerState(logEndOffset, 

[jira] [Updated] (KAFKA-7488) Controller not recovering after disconnection to zookeeper

2018-10-05 Thread Luigi Tagliamonte (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luigi Tagliamonte updated KAFKA-7488:
-
Description: 
This issue seems related to https://issues.apache.org/jira/browse/KAFKA-2729, 
which has been marked as resolved.

The issue still exists in Kafka 1.1.

Cluster details:
 * 3-node Kafka cluster running 1.1
 * 3-node Zookeeper cluster running 3.4.10

Today, while I was replacing a zookeeper server 
([10.48.208.70|http://10.48.208.70/]), the leader among the brokers experienced 
this issue:
{code:java}
[2018-10-05 21:03:02,799] INFO [GroupMetadataManager brokerId=1] Removed 0 
expired offsets in 0 milliseconds. 
(kafka.coordinator.group.GroupMetadataManager)
[2018-10-05 21:08:20,060] INFO Unable to read additional data from server 
sessionid 0x34663b434985000e, likely server has closed socket, closing socket 
connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2018-10-05 21:08:21,001] INFO Opening socket connection to server 
10.48.208.70/10.48.208.70:2181. Will not attempt to authenticate using SASL 
(unknown error) (org.apache.zookeeper.ClientCnxn)
[2018-10-05 21:08:21,003] WARN Session 0x34663b434985000e for server null, 
unexpected error, closing socket connection and attempting reconnect 
(org.apache.zookeeper.ClientCnxn)
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at 
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
[2018-10-05 21:08:21,797] INFO Opening socket connection to server 
10.48.210.44/10.48.210.44:2181. Will not attempt to authenticate using SASL 
(unknown error) (org.apache.zookeeper.ClientCnxn)
[2018-10-05 21:08:21,799] INFO Socket connection established to 
10.48.210.44/10.48.210.44:2181, initiating session 
(org.apache.zookeeper.ClientCnxn)
[2018-10-05 21:08:21,802] INFO Session establishment complete on server 
10.48.210.44/10.48.210.44:2181, sessionid = 0x34663b434985000e, negotiated 
timeout = 6000 (org.apache.zookeeper.ClientCnxn)
[2018-10-05 21:08:28,015] INFO Creating /controller (is it secure? false) 
(kafka.zk.KafkaZkClient)
[2018-10-05 21:08:28,015] INFO Creating /controller (is it secure? false) 
(kafka.zk.KafkaZkClient)
[2018-10-05 21:08:28,025] ERROR Error while creating ephemeral at /controller, 
node already exists and owner '3703712903740981258' does not match current 
session '3775770497779040270' (kafka.zk.KafkaZkClient$CheckedEphemeral)
[2018-10-05 21:08:28,025] ERROR Error while creating ephemeral at /controller, 
node already exists and owner '3703712903740981258' does not match current 
session '3775770497779040270' (kafka.zk.KafkaZkClient$CheckedEphemeral)
[2018-10-05 21:08:28,025] INFO Result of znode creation at /controller is: 
NODEEXISTS (kafka.zk.KafkaZkClient)
[2018-10-05 21:08:28,025] INFO Result of znode creation at /controller is: 
NODEEXISTS (kafka.zk.KafkaZkClient)
[2018-10-05 21:08:42,561] INFO [Partition -store-changelog-7 broker=1] 
Shrinking ISR from 2,1,3 to 1 (kafka.cluster.Partition)
[2018-10-05 21:08:42,561] INFO [Partition -store-changelog-7 broker=1] 
Shrinking ISR from 2,1,3 to 1 (kafka.cluster.Partition)
[2018-10-05 21:08:42,569] INFO [Partition -store-changelog-7 broker=1] 
Cached zkVersion [11] not equal to that in zookeeper, skip updating ISR 
(kafka.cluster.Partition)
[2018-10-05 21:08:42,569] INFO [Partition -store-changelog-7 broker=1] 
Cached zkVersion [11] not equal to that in zookeeper, skip updating ISR 
(kafka.cluster.Partition)
[2018-10-05 21:08:42,569] INFO [Partition bycontact_0-19 broker=1] 
Shrinking ISR from 2,1,3 to 1 (kafka.cluster.Partition)
[2018-10-05 21:08:42,569] INFO [Partition bycontact_0-19 broker=1] 
Shrinking ISR from 2,1,3 to 1 (kafka.cluster.Partition)
[2018-10-05 21:08:42,574] INFO [Partition bycontact_0-19 broker=1] 
Cached zkVersion [44] not equal to that in zookeeper, skip updating ISR 
(kafka.cluster.Partition)
[2018-10-05 21:08:42,574] INFO [Partition bycontact_0-19 broker=1] 
Cached zkVersion [44] not equal to that in zookeeper, skip updating ISR 
(kafka.cluster.Partition){code}
 The only way to recover was to restart the broker.

  was:
This issue seems related to https://issues.apache.org/jira/browse/KAFKA-2729, 
which has been marked as resolved.

The issue still exists in Kafka 1.1.

Cluster details:
 * 3-node Kafka cluster running 1.1
 * 3-node Zookeeper cluster running 3.4.10

Today, while I was replacing a zookeeper server, the leader among the brokers 
experienced this issue:
{code:java}
[2018-10-05 21:03:02,799] INFO [GroupMetadataManager brokerId=1] Removed 0 
expired offsets in 0 milliseconds. 

[jira] [Created] (KAFKA-7488) Controller not recovering after disconnection to zookeeper

2018-10-05 Thread Luigi Tagliamonte (JIRA)
Luigi Tagliamonte created KAFKA-7488:


 Summary: Controller not recovering after disconnection to zookeeper
 Key: KAFKA-7488
 URL: https://issues.apache.org/jira/browse/KAFKA-7488
 Project: Kafka
  Issue Type: Bug
  Components: controller
Affects Versions: 1.1.0
Reporter: Luigi Tagliamonte


This issue seems related to https://issues.apache.org/jira/browse/KAFKA-2729, 
which has been marked as resolved.

The issue still exists in Kafka 1.1.

Cluster details:
 * 3-node Kafka cluster running 1.1
 * 3-node Zookeeper cluster running 3.4.10

Today, while I was replacing a zookeeper server, the leader among the brokers 
experienced this issue:
{code:java}
[2018-10-05 21:03:02,799] INFO [GroupMetadataManager brokerId=1] Removed 0 
expired offsets in 0 milliseconds. 
(kafka.coordinator.group.GroupMetadataManager)
[2018-10-05 21:08:20,060] INFO Unable to read additional data from server 
sessionid 0x34663b434985000e, likely server has closed socket, closing socket 
connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2018-10-05 21:08:21,001] INFO Opening socket connection to server 
10.48.208.70/10.48.208.70:2181. Will not attempt to authenticate using SASL 
(unknown error) (org.apache.zookeeper.ClientCnxn)
[2018-10-05 21:08:21,003] WARN Session 0x34663b434985000e for server null, 
unexpected error, closing socket connection and attempting reconnect 
(org.apache.zookeeper.ClientCnxn)
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at 
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
[2018-10-05 21:08:21,797] INFO Opening socket connection to server 
10.48.210.44/10.48.210.44:2181. Will not attempt to authenticate using SASL 
(unknown error) (org.apache.zookeeper.ClientCnxn)
[2018-10-05 21:08:21,799] INFO Socket connection established to 
10.48.210.44/10.48.210.44:2181, initiating session 
(org.apache.zookeeper.ClientCnxn)
[2018-10-05 21:08:21,802] INFO Session establishment complete on server 
10.48.210.44/10.48.210.44:2181, sessionid = 0x34663b434985000e, negotiated 
timeout = 6000 (org.apache.zookeeper.ClientCnxn)
[2018-10-05 21:08:28,015] INFO Creating /controller (is it secure? false) 
(kafka.zk.KafkaZkClient)
[2018-10-05 21:08:28,015] INFO Creating /controller (is it secure? false) 
(kafka.zk.KafkaZkClient)
[2018-10-05 21:08:28,025] ERROR Error while creating ephemeral at /controller, 
node already exists and owner '3703712903740981258' does not match current 
session '3775770497779040270' (kafka.zk.KafkaZkClient$CheckedEphemeral)
[2018-10-05 21:08:28,025] ERROR Error while creating ephemeral at /controller, 
node already exists and owner '3703712903740981258' does not match current 
session '3775770497779040270' (kafka.zk.KafkaZkClient$CheckedEphemeral)
[2018-10-05 21:08:28,025] INFO Result of znode creation at /controller is: 
NODEEXISTS (kafka.zk.KafkaZkClient)
[2018-10-05 21:08:28,025] INFO Result of znode creation at /controller is: 
NODEEXISTS (kafka.zk.KafkaZkClient)
[2018-10-05 21:08:42,561] INFO [Partition -store-changelog-7 broker=1] 
Shrinking ISR from 2,1,3 to 1 (kafka.cluster.Partition)
[2018-10-05 21:08:42,561] INFO [Partition -store-changelog-7 broker=1] 
Shrinking ISR from 2,1,3 to 1 (kafka.cluster.Partition)
[2018-10-05 21:08:42,569] INFO [Partition -store-changelog-7 broker=1] 
Cached zkVersion [11] not equal to that in zookeeper, skip updating ISR 
(kafka.cluster.Partition)
[2018-10-05 21:08:42,569] INFO [Partition -store-changelog-7 broker=1] 
Cached zkVersion [11] not equal to that in zookeeper, skip updating ISR 
(kafka.cluster.Partition)
[2018-10-05 21:08:42,569] INFO [Partition bycontact_0-19 broker=1] 
Shrinking ISR from 2,1,3 to 1 (kafka.cluster.Partition)
[2018-10-05 21:08:42,569] INFO [Partition bycontact_0-19 broker=1] 
Shrinking ISR from 2,1,3 to 1 (kafka.cluster.Partition)
[2018-10-05 21:08:42,574] INFO [Partition bycontact_0-19 broker=1] 
Cached zkVersion [44] not equal to that in zookeeper, skip updating ISR 
(kafka.cluster.Partition)
[2018-10-05 21:08:42,574] INFO [Partition bycontact_0-19 broker=1] 
Cached zkVersion [44] not equal to that in zookeeper, skip updating ISR 
(kafka.cluster.Partition){code}
 The only way to recover was to restart the broker.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-5200) If a replicated topic is deleted with one broker down, it can't be recreated

2018-10-05 Thread Adam Elliott (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640384#comment-16640384
 ] 

Adam Elliott commented on KAFKA-5200:
-

My team runs a multi-tenant Kafka cluster with a lot of diverse uses, and one 
of the services we provide is an API for managed topic creation/deletion. The 
cluster is large (> 100 nodes) and so it's pretty likely that, for whatever 
reason, at least one node will be down at any given point--and sometimes for 
extended periods.
 
We're currently struggling with the behaviour described above. From what I can 
see in the source, this is intentional behaviour. We don't have control over 
when clients choose to delete topics, so we can't reasonably block deletions 
for reasons that they would see as arbitrary ("some backend server is down, try 
again later").
 
The open source partition reassignment tool _does_ work, as of the version 
we're using at least, to move replicas off of dead brokers, but only if the 
topic hasn't already been deleted. If it has, the only remedy is manual surgery 
to Zookeeper state and bouncing the controller.
 
There's one additional factor which makes this bug worse: if too many topics 
are "half-deleted" at once, the controller crashes/becomes unresponsive; at 
which point a minor annoyance for one of our customers becomes something much 
more serious.
 
I've had a look at the various deletion related state machines and I don't see 
an easy fix. I also haven't seen much mention or discussion of this problem 
apart from this issue.

> If a replicated topic is deleted with one broker down, it can't be recreated
> 
>
> Key: KAFKA-5200
> URL: https://issues.apache.org/jira/browse/KAFKA-5200
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Reporter: Edoardo Comar
>Priority: Major
>
> In a cluster with 5 brokers, replication factor=3, and min.insync.replicas=2,
> one broker went down.
> A user's app remained, of course, unaware of that and deleted a topic that 
> (unknowingly) had a replica on the dead broker.
> The topic went into 'pending delete' mode.
> The user then tried to recreate the topic - which failed, so his app was left 
> stuck - no working topic and no ability to create one.
> The reassignment tool fails to move the replica off the dead broker - 
> specifically because the broker with the partition replica to move is dead :-)
> Incidentally the confluent-rebalancer docs say
> http://docs.confluent.io/current/kafka/post-deployment.html#scaling-the-cluster
> > Supports moving partitions away from dead brokers
> It'd be nice to similarly improve the open-source reassignment tool.
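For reference, a concrete illustration (the topic name, broker ids, and file names are made up) of how the stock reassignment tool discussed above is driven when moving a replica off a dead broker: the desired replica set per partition goes into a JSON file that is then submitted with --execute.

{code}
# reassign.json: move partition 0 of "orders" off dead broker 5, keeping 1 and 2
{
  "version": 1,
  "partitions": [
    { "topic": "orders", "partition": 0, "replicas": [1, 2, 4] }
  ]
}

# Submit the reassignment (ZooKeeper-based syntax used by Kafka 1.x/2.0):
bin/kafka-reassign-partitions.sh --zookeeper zk1:2181 \
  --reassignment-json-file reassign.json --execute
{code}

As the comment above notes, this only works while the topic still exists; once the topic is stuck in 'pending delete' with a replica on a dead broker, the remaining remedy described there is manual surgery on the ZooKeeper state.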



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-2729) Cached zkVersion not equal to that in zookeeper, broker not recovering.

2018-10-05 Thread Luigi Tagliamonte (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640382#comment-16640382
 ] 

Luigi Tagliamonte commented on KAFKA-2729:
--

This issue does not seem to be fixed in 1.1.

Cluster details:
 * 3-node Kafka cluster running 1.1
 * 3-node Zookeeper cluster running 3.4.10

Today, while I was replacing a zookeeper server, the leader among the brokers 
experienced this issue:
{code:java}
[2018-10-05 21:03:02,799] INFO [GroupMetadataManager brokerId=1] Removed 0 
expired offsets in 0 milliseconds. 
(kafka.coordinator.group.GroupMetadataManager)
[2018-10-05 21:08:20,060] INFO Unable to read additional data from server 
sessionid 0x34663b434985000e, likely server has closed socket, closing socket 
connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2018-10-05 21:08:21,001] INFO Opening socket connection to server 
10.48.208.70/10.48.208.70:2181. Will not attempt to authenticate using SASL 
(unknown error) (org.apache.zookeeper.ClientCnxn)
[2018-10-05 21:08:21,003] WARN Session 0x34663b434985000e for server null, 
unexpected error, closing socket connection and attempting reconnect 
(org.apache.zookeeper.ClientCnxn)
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at 
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
[2018-10-05 21:08:21,797] INFO Opening socket connection to server 
10.48.210.44/10.48.210.44:2181. Will not attempt to authenticate using SASL 
(unknown error) (org.apache.zookeeper.ClientCnxn)
[2018-10-05 21:08:21,799] INFO Socket connection established to 
10.48.210.44/10.48.210.44:2181, initiating session 
(org.apache.zookeeper.ClientCnxn)
[2018-10-05 21:08:21,802] INFO Session establishment complete on server 
10.48.210.44/10.48.210.44:2181, sessionid = 0x34663b434985000e, negotiated 
timeout = 6000 (org.apache.zookeeper.ClientCnxn)
[2018-10-05 21:08:28,015] INFO Creating /controller (is it secure? false) 
(kafka.zk.KafkaZkClient)
[2018-10-05 21:08:28,015] INFO Creating /controller (is it secure? false) 
(kafka.zk.KafkaZkClient)
[2018-10-05 21:08:28,025] ERROR Error while creating ephemeral at /controller, 
node already exists and owner '3703712903740981258' does not match current 
session '3775770497779040270' (kafka.zk.KafkaZkClient$CheckedEphemeral)
[2018-10-05 21:08:28,025] ERROR Error while creating ephemeral at /controller, 
node already exists and owner '3703712903740981258' does not match current 
session '3775770497779040270' (kafka.zk.KafkaZkClient$CheckedEphemeral)
[2018-10-05 21:08:28,025] INFO Result of znode creation at /controller is: 
NODEEXISTS (kafka.zk.KafkaZkClient)
[2018-10-05 21:08:28,025] INFO Result of znode creation at /controller is: 
NODEEXISTS (kafka.zk.KafkaZkClient)
[2018-10-05 21:08:42,561] INFO [Partition -store-changelog-7 broker=1] 
Shrinking ISR from 2,1,3 to 1 (kafka.cluster.Partition)
[2018-10-05 21:08:42,561] INFO [Partition -store-changelog-7 broker=1] 
Shrinking ISR from 2,1,3 to 1 (kafka.cluster.Partition)
[2018-10-05 21:08:42,569] INFO [Partition -store-changelog-7 broker=1] 
Cached zkVersion [11] not equal to that in zookeeper, skip updating ISR 
(kafka.cluster.Partition)
[2018-10-05 21:08:42,569] INFO [Partition -store-changelog-7 broker=1] 
Cached zkVersion [11] not equal to that in zookeeper, skip updating ISR 
(kafka.cluster.Partition)
[2018-10-05 21:08:42,569] INFO [Partition bycontact_0-19 broker=1] 
Shrinking ISR from 2,1,3 to 1 (kafka.cluster.Partition)
[2018-10-05 21:08:42,569] INFO [Partition bycontact_0-19 broker=1] 
Shrinking ISR from 2,1,3 to 1 (kafka.cluster.Partition)
[2018-10-05 21:08:42,574] INFO [Partition bycontact_0-19 broker=1] 
Cached zkVersion [44] not equal to that in zookeeper, skip updating ISR 
(kafka.cluster.Partition)
[2018-10-05 21:08:42,574] INFO [Partition bycontact_0-19 broker=1] 
Cached zkVersion [44] not equal to that in zookeeper, skip updating ISR 
(kafka.cluster.Partition){code}
 The only way to recover was to restart the broker.

> Cached zkVersion not equal to that in zookeeper, broker not recovering.
> ---
>
> Key: KAFKA-2729
> URL: https://issues.apache.org/jira/browse/KAFKA-2729
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1, 0.9.0.0, 0.10.0.0, 0.10.1.0, 0.11.0.0
>Reporter: Danil Serdyuchenko
>Assignee: Onur Karaman
>Priority: Major
> Fix For: 1.1.0
>
>
> After a small network wobble where zookeeper nodes couldn't reach each other, 
> we started seeing a large number of undereplicated partitions. The zookeeper 
> cluster recovered, however we continued to see a 

[jira] [Commented] (KAFKA-7484) Fix test SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown()

2018-10-05 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640351#comment-16640351
 ] 

ASF GitHub Bot commented on KAFKA-7484:
---

mjsax closed pull request #5748: KAFKA-7484: fix suppression integration tests
URL: https://github.com/apache/kafka/pull/5748
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/streams/src/test/java/org/apache/kafka/streams/integration/SuppressionDurabilityIntegrationTest.java b/streams/src/test/java/org/apache/kafka/streams/integration/SuppressionDurabilityIntegrationTest.java
index 93ecc53a7fc..c26b52f5a0b 100644
--- a/streams/src/test/java/org/apache/kafka/streams/integration/SuppressionDurabilityIntegrationTest.java
+++ b/streams/src/test/java/org/apache/kafka/streams/integration/SuppressionDurabilityIntegrationTest.java
@@ -75,7 +75,11 @@
 @Category({IntegrationTest.class})
 public class SuppressionDurabilityIntegrationTest {
     @ClassRule
-    public static final EmbeddedKafkaCluster CLUSTER = new EmbeddedKafkaCluster(3);
+    public static final EmbeddedKafkaCluster CLUSTER = new EmbeddedKafkaCluster(
+        3,
+        mkProperties(mkMap()),
+        0L
+    );
     private static final StringDeserializer STRING_DESERIALIZER = new StringDeserializer();
     private static final StringSerializer STRING_SERIALIZER = new StringSerializer();
     private static final Serde<String> STRING_SERDE = Serdes.String();
diff --git a/streams/src/test/java/org/apache/kafka/streams/integration/SuppressionIntegrationTest.java b/streams/src/test/java/org/apache/kafka/streams/integration/SuppressionIntegrationTest.java
index 94bc0570b0f..ee32a1dc371 100644
--- a/streams/src/test/java/org/apache/kafka/streams/integration/SuppressionIntegrationTest.java
+++ b/streams/src/test/java/org/apache/kafka/streams/integration/SuppressionIntegrationTest.java
@@ -75,7 +75,11 @@
 @Category({IntegrationTest.class})
 public class SuppressionIntegrationTest {
     @ClassRule
-    public static final EmbeddedKafkaCluster CLUSTER = new EmbeddedKafkaCluster(1);
+    public static final EmbeddedKafkaCluster CLUSTER = new EmbeddedKafkaCluster(
+        1,
+        mkProperties(mkMap()),
+        0L
+    );
     private static final StringDeserializer STRING_DESERIALIZER = new StringDeserializer();
     private static final StringSerializer STRING_SERIALIZER = new StringSerializer();
     private static final Serde<String> STRING_SERDE = Serdes.String();


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Fix test 
> SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown()
> 
>
> Key: KAFKA-7484
> URL: https://issues.apache.org/jira/browse/KAFKA-7484
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.1.0
>Reporter: Dong Lin
>Assignee: John Roesler
>Priority: Major
>
> The test 
> SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown() fails 
> in the 2.1.0 branch Jenkins job. See 
> [https://builds.apache.org/job/kafka-2.1-jdk8/1/testReport/junit/org.apache.kafka.streams.integration/SuppressionDurabilityIntegrationTest/shouldRecoverBufferAfterShutdown_1__eosEnabled_true_/.|https://builds.apache.org/job/kafka-2.1-jdk8/1/testReport/junit/org.apache.kafka.streams.integration/SuppressionDurabilityIntegrationTest/shouldRecoverBufferAfterShutdown_1__eosEnabled_true_/]
> Here is the stack trace: 
> java.lang.AssertionError: Condition not met within timeout 3. Did not 
> receive all 3 records from topic output-raw-shouldRecoverBufferAfterShutdown 
> at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:278) at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinRecordsReceived(IntegrationTestUtils.java:462)
>  at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinRecordsReceived(IntegrationTestUtils.java:343)
>  at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.verifyKeyValueTimestamps(IntegrationTestUtils.java:543)
>  at 
> org.apache.kafka.streams.integration.SuppressionDurabilityIntegrationTest.verifyOutput(SuppressionDurabilityIntegrationTest.java:239)
>  at 
> 

[jira] [Resolved] (KAFKA-6880) Zombie replicas must be fenced

2018-10-05 Thread Jason Gustafson (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-6880.

   Resolution: Fixed
Fix Version/s: (was: 2.2.0)
   2.1.0

This was fixed in KAFKA-7395

> Zombie replicas must be fenced
> --
>
> Key: KAFKA-6880
> URL: https://issues.apache.org/jira/browse/KAFKA-6880
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Major
>  Labels: needs-kip
> Fix For: 2.1.0
>
>
> Let's say we have three replicas for a partition: 1, 2, and 3.
> In epoch 0, broker 1 is the leader and writes up to offset 50. Broker 2 
> replicates up to offset 50, but broker 3 is a little behind at offset 40. The 
> high watermark is 40. 
> Suppose that broker 2 has a zk session expiration event, but fails to detect 
> it or fails to reestablish a session (e.g. due to a bug like KAFKA-6879), and 
> it continues fetching from broker 1.
> For whatever reason, broker 3 is elected the leader for epoch 1 beginning at 
> offset 40. Broker 1 detects the leader change and truncates its log to offset 
> 40. Some new data is appended up to offset 60, which is fully replicated to 
> broker 1. Broker 2 continues fetching from broker 1 at offset 50, but gets 
> NOT_LEADER_FOR_PARTITION errors, which is retriable and hence broker 2 will 
> retry.
> After some time, broker 1 becomes the leader again for epoch 2. Broker 1 
> begins at offset 60. Broker 2 has not exhausted retries and is now able to 
> fetch at offset 50 and append the last 10 records in order to catch up. 
> However, because it did not observe the leader changes, it never saw the 
> need to truncate its log. Hence offsets 40-49 still reflect the uncommitted 
> changes from epoch 0. Neither KIP-101 nor KIP-279 can fix this because the 
> tail of the log is correct.
> The basic problem is that zombie replicas are not fenced properly by the 
> leader epoch. We actually observed a sequence roughly like this after a 
> broker had partially deadlocked from KAFKA-6879. We should add the leader 
> epoch to fetch requests so that the leader can fence the zombie replicas.
> A related problem is that we currently allow such zombie replicas to be added 
> to the ISR even if they are in an offline state. The problem is that the 
> controller will never elect them, so being part of the ISR does not give the 
> availability guarantee that is intended. This would also be fixed by 
> verifying replica leader epoch in fetch requests.
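To make the proposed fencing concrete, here is a minimal sketch (Java; the broker-side logic is Scala in practice, and only the error names FENCED_LEADER_EPOCH and UNKNOWN_LEADER_EPOCH come from KIP-320) of the epoch check a leader could apply to fetch requests:

{code:java}
// Error codes introduced by KIP-320 for fencing fetchers on leader epoch.
enum FetchError { NONE, FENCED_LEADER_EPOCH, UNKNOWN_LEADER_EPOCH }

final class LeaderEpochFencing {
    // Sketch of the leader-side validation of the epoch carried by a fetch request.
    static FetchError validateFetchEpoch(int currentLeaderEpoch, int requestLeaderEpoch) {
        if (requestLeaderEpoch < currentLeaderEpoch)
            // A zombie replica still fetching with an old epoch (the scenario above):
            // it must refresh metadata and truncate before it is allowed to fetch.
            return FetchError.FENCED_LEADER_EPOCH;
        if (requestLeaderEpoch > currentLeaderEpoch)
            // The fetcher knows about a newer epoch than this broker; the broker's
            // own metadata is stale, so it cannot safely serve the fetch.
            return FetchError.UNKNOWN_LEADER_EPOCH;
        return FetchError.NONE;
    }

    public static void main(String[] args) {
        // Broker 1 is leader again in epoch 2; zombie broker 2 still fetches with epoch 0.
        System.out.println(validateFetchEpoch(2, 0)); // FENCED_LEADER_EPOCH
    }
}
{code}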



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7395) Add fencing to replication protocol (KIP-320)

2018-10-05 Thread Jason Gustafson (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-7395.

   Resolution: Fixed
Fix Version/s: 2.1.0

> Add fencing to replication protocol (KIP-320)
> -
>
> Key: KAFKA-7395
> URL: https://issues.apache.org/jira/browse/KAFKA-7395
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Major
> Fix For: 2.1.0
>
>
> This patch implements the broker-side changes to support fencing improvements 
> from KIP-320: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-320%3A+Allow+fetchers+to+detect+and+handle+log+truncation



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7487) DumpLogSegments reports mismatches for indexed offsets which are not at the start of a record batch

2018-10-05 Thread Jason Gustafson (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson updated KAFKA-7487:
---
Labels: newbie  (was: )

> DumpLogSegments reports mismatches for indexed offsets which are not at the 
> start of a record batch
> ---
>
> Key: KAFKA-7487
> URL: https://issues.apache.org/jira/browse/KAFKA-7487
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 2.0.0
>Reporter: Michael Bingham
>Priority: Minor
>  Labels: newbie
>
> When running {{DumpLogSegments}} against an {{.index}} file, mismatches may 
> be reported when the indexed message offset is not the first record in a 
> batch. For example:
> {code}
>  Mismatches in 
> :/var/lib/kafka/data/replicated-topic-0/.index
>  Index offset: 968, log offset: 966
> {code}
> And looking at the corresponding {{.log}} file:
> {code}
> baseOffset: 966 lastOffset: 968 count: 3 baseSequence: -1 lastSequence: -1 
> producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 0 isTransactional: 
> false position: 3952771 CreateTime: 1538768639065 isvalid: true size: 12166 
> magic: 2 compresscodec: NONE crc: 294402254 
> {code}
> In this case, the last offset in the batch was indexed instead of the first, 
> but the index has to map physical position to the start of the batch, leading 
> to the mismatch.
> It seems like {{DumpLogSegments}} should not report these cases as 
> mismatches, which users might interpret as an error or problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7487) DumpLogSegments reports mismatches for indexed offsets which are not at the start of a record batch

2018-10-05 Thread Michael Bingham (JIRA)
Michael Bingham created KAFKA-7487:
--

 Summary: DumpLogSegments reports mismatches for indexed offsets 
which are not at the start of a record batch
 Key: KAFKA-7487
 URL: https://issues.apache.org/jira/browse/KAFKA-7487
 Project: Kafka
  Issue Type: Improvement
  Components: core
Affects Versions: 2.0.0
Reporter: Michael Bingham


When running {{DumpLogSegments}} against an {{.index}} file, mismatches may be 
reported when the indexed message offset is not the first record in a batch. 
For example:

{code}
 Mismatches in 
:/var/lib/kafka/data/replicated-topic-0/.index
 Index offset: 968, log offset: 966
{code}

And looking at the corresponding {{.log}} file:

{code}
baseOffset: 966 lastOffset: 968 count: 3 baseSequence: -1 lastSequence: -1 
producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 0 isTransactional: false 
position: 3952771 CreateTime: 1538768639065 isvalid: true size: 12166 magic: 2 
compresscodec: NONE crc: 294402254 
{code}

In this case, the last offset in the batch was indexed instead of the first, 
but the index has to map physical position to the start of the batch, leading 
to the mismatch.

It seems like {{DumpLogSegments}} should not report these cases as mismatches, 
which users might interpret as an error or problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7483) Streams should allow headers to be passed to Serializer

2018-10-05 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640304#comment-16640304
 ] 

ASF GitHub Bot commented on KAFKA-7483:
---

kamalcph opened a new pull request #5751: KAFKA-7483: Allow streams to pass 
headers through Serializer.
URL: https://github.com/apache/kafka/pull/5751
 
 
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Streams should allow headers to be passed to Serializer
> ---
>
> Key: KAFKA-7483
> URL: https://issues.apache.org/jira/browse/KAFKA-7483
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Kamal Chandraprakash
>Assignee: Kamal Chandraprakash
>Priority: Major
>
> We are storing schema metadata for the record key and value in the header. The 
> serializer includes this metadata in the record header. While doing a simple 
> record transformation (x transformed to y) in streams, the same header that 
> was passed from the source is pushed to the sink topic. This leads to an error 
> while reading the sink topic.
> We should call the overloaded `serialize(topic, headers, object)` method in 
> [RecordCollectorImpl|https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/RecordCollectorImpl.java#L156],
>  which in turn adds the correct metadata to the record header.
> With this, sink topic readers have the option to read all the values for a 
> header key using `Headers#headers` or only the latest value using 
> `Headers#lastHeader`.
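For illustration (assuming a client version that has the header-aware Serializer overload the description refers to; the topic and header names are made up), the overloaded call and the two read paths look like this:

{code:java}
import org.apache.kafka.common.header.Header;
import org.apache.kafka.common.header.Headers;
import org.apache.kafka.common.header.internals.RecordHeaders;
import org.apache.kafka.common.serialization.Serializer;
import org.apache.kafka.common.serialization.StringSerializer;

import java.nio.charset.StandardCharsets;

public class HeaderAwareSerializeExample {
    public static void main(String[] args) {
        Headers headers = new RecordHeaders();
        // Suppose the source put schema metadata into the header; a schema-aware
        // serializer may append or overwrite it for the transformed value.
        headers.add("schema-id", "42".getBytes(StandardCharsets.UTF_8));

        Serializer<String> valueSerializer = new StringSerializer();
        // The overloaded call the description asks RecordCollectorImpl to use:
        byte[] payload = valueSerializer.serialize("sink-topic", headers, "transformed-value");

        // Readers of the sink topic can then choose between:
        Iterable<Header> all = headers.headers("schema-id");   // every value recorded for the key
        Header latest = headers.lastHeader("schema-id");       // only the most recently written value
        System.out.println(payload.length + " " + latest.key() + " " + all.iterator().hasNext());
    }
}
{code}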



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-6914) Kafka Connect - Plugins class should have a constructor that can take in parent ClassLoader

2018-10-05 Thread Jason Gustafson (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-6914.

   Resolution: Fixed
Fix Version/s: 2.1.0
   2.0.1
   1.1.2
   1.0.3
   0.11.0.4

> Kafka Connect - Plugins class should have a constructor that can take in 
> parent ClassLoader
> ---
>
> Key: KAFKA-6914
> URL: https://issues.apache.org/jira/browse/KAFKA-6914
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Reporter: Sriram KS
>Assignee: Konstantine Karantasis
>Priority: Minor
> Fix For: 0.11.0.4, 1.0.3, 1.1.2, 2.0.1, 2.1.0
>
>
> Currently the Plugins class has a single constructor that takes in a map of props.
> Please make the Plugins class have a constructor that takes in a ClassLoader as 
> well and use it to set the DelegatingClassLoader's parent classloader.
> Reason:
> This will be useful if I already have a managed classloader environment, 
> like a Spring Boot app, which resolves my class dependencies using my 
> Maven/Gradle dependency management.
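A sketch of the requested API shape (the class and two-argument constructor below are hypothetical; the change that was eventually merged, shown in the PR diff quoted in the next message, instead derives the parent from the classloader that loaded Connect's own classes):

{code:java}
import java.util.Map;

// Hypothetical shape of the requested API, not the actual
// org.apache.kafka.connect.runtime.isolation.Plugins class.
public class PluginsSketch {
    private final ClassLoader parent;

    // Existing style of constructor: only worker props; parent defaults to the system classloader.
    public PluginsSketch(Map<String, String> props) {
        this(props, ClassLoader.getSystemClassLoader());
    }

    // Requested overload: the caller supplies the parent classloader, e.g. a Spring Boot
    // application classloader, so plugin classes resolve through managed dependencies.
    public PluginsSketch(Map<String, String> props, ClassLoader parent) {
        this.parent = parent;
        // ... would pass 'parent' on to the DelegatingClassLoader it creates ...
    }

    public ClassLoader parentClassLoader() {
        return parent;
    }
}
{code}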



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6914) Kafka Connect - Plugins class should have a constructor that can take in parent ClassLoader

2018-10-05 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640272#comment-16640272
 ] 

ASF GitHub Bot commented on KAFKA-6914:
---

hachikuji closed pull request #5720: KAFKA-6914: Set parent classloader of 
DelegatingClassLoader same as the worker's
URL: https://github.com/apache/kafka/pull/5720
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/isolation/DelegatingClassLoader.java b/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/isolation/DelegatingClassLoader.java
index 144dbd87f55..6104dd4426a 100644
--- a/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/isolation/DelegatingClassLoader.java
+++ b/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/isolation/DelegatingClassLoader.java
@@ -94,7 +94,11 @@ public DelegatingClassLoader(List<String> pluginPaths, ClassLoader parent) {
     }
 
     public DelegatingClassLoader(List<String> pluginPaths) {
-        this(pluginPaths, ClassLoader.getSystemClassLoader());
+        // Use as parent the classloader that loaded this class. In most cases this will be the
+        // System classloader. But this choice here provides additional flexibility in managed
+        // environments that control classloading differently (OSGi, Spring and others) and don't
+        // depend on the System classloader to load Connect's classes.
+        this(pluginPaths, DelegatingClassLoader.class.getClassLoader());
     }
 
     public Set<PluginDesc<Connector>> connectors() {


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Kafka Connect - Plugins class should have a constructor that can take in 
> parent ClassLoader
> ---
>
> Key: KAFKA-6914
> URL: https://issues.apache.org/jira/browse/KAFKA-6914
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Reporter: Sriram KS
>Assignee: Konstantine Karantasis
>Priority: Minor
>
> Currently the Plugins class has a single constructor that takes in a map of props.
> Please make the Plugins class have a constructor that takes in a ClassLoader as 
> well and use it to set the DelegatingClassLoader's parent classloader.
> Reason:
> This will be useful if I already have a managed classloader environment, 
> like a Spring Boot app, which resolves my class dependencies using my 
> Maven/Gradle dependency management.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7467) NoSuchElementException is raised because controlBatch is empty

2018-10-05 Thread Jason Gustafson (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-7467.

   Resolution: Fixed
Fix Version/s: 2.1.0
   2.0.1
   1.1.2
   1.0.3

> NoSuchElementException is raised because controlBatch is empty
> --
>
> Key: KAFKA-7467
> URL: https://issues.apache.org/jira/browse/KAFKA-7467
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Badai Aqrandista
>Assignee: Bob Barrett
>Priority: Major
> Fix For: 1.0.3, 1.1.2, 2.0.1, 2.1.0
>
>
> Somehow, log cleaner died because of NoSuchElementException when it calls 
> onControlBatchRead:
> {noformat}
> [2018-09-25 14:18:31,088] INFO Cleaner 0: Cleaning segment 0 in log 
> __consumer_offsets-45 (largest timestamp Fri Apr 27 16:12:39 CDT 2018) into 
> 0, discarding deletes. (kafka.log.LogCleaner)
> [2018-09-25 14:18:31,092] ERROR [kafka-log-cleaner-thread-0]: Error due to 
> (kafka.log.LogCleaner)
> java.util.NoSuchElementException
>   at java.util.Collections$EmptyIterator.next(Collections.java:4189)
>   at 
> kafka.log.CleanedTransactionMetadata.onControlBatchRead(LogCleaner.scala:945)
>   at 
> kafka.log.Cleaner.kafka$log$Cleaner$$shouldDiscardBatch(LogCleaner.scala:636)
>   at kafka.log.Cleaner$$anon$5.checkBatchRetention(LogCleaner.scala:573)
>   at 
> org.apache.kafka.common.record.MemoryRecords.filterTo(MemoryRecords.java:157)
>   at 
> org.apache.kafka.common.record.MemoryRecords.filterTo(MemoryRecords.java:138)
>   at kafka.log.Cleaner.cleanInto(LogCleaner.scala:604)
>   at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:518)
>   at kafka.log.Cleaner$$anonfun$doClean$4.apply(LogCleaner.scala:462)
>   at kafka.log.Cleaner$$anonfun$doClean$4.apply(LogCleaner.scala:461)
>   at scala.collection.immutable.List.foreach(List.scala:392)
>   at kafka.log.Cleaner.doClean(LogCleaner.scala:461)
>   at kafka.log.Cleaner.clean(LogCleaner.scala:438)
>   at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:305)
>   at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:291)
>   at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
> [2018-09-25 14:18:31,093] INFO [kafka-log-cleaner-thread-0]: Stopped 
> (kafka.log.LogCleaner)
> {noformat}
> The following code does not seem to expect the controlBatch to be empty:
> [https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/LogCleaner.scala#L946]
> {noformat}
>   def onControlBatchRead(controlBatch: RecordBatch): Boolean = {
>     consumeAbortedTxnsUpTo(controlBatch.lastOffset)
>     val controlRecord = controlBatch.iterator.next()
>     val controlType = ControlRecordType.parse(controlRecord.key)
>     val producerId = controlBatch.producerId
> {noformat}
> MemoryRecords.filterTo copies the original control attribute for empty 
> batches, which results in empty control batches. Trying to read the control 
> type of an empty batch causes the error.
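> A minimal sketch (Java, illustrative only; the real cleaner code is Scala) of the 
> guard onControlBatchRead needs, followed below by the filterTo code that produces 
> such empty batches:
> {code:java}
> import java.util.Collections;
> import java.util.Iterator;
>
> final class ControlBatchGuard {
>     // Sketch only: 'records' stands in for controlBatch.iterator() in the real code.
>     static boolean safeControlBatchRead(Iterator<byte[]> records) {
>         if (!records.hasNext()) {
>             // An empty control batch retained by MemoryRecords.filterTo: there is no
>             // control record to parse, so bail out instead of calling next() and
>             // triggering the NoSuchElementException seen in the log cleaner above.
>             return false;
>         }
>         byte[] controlRecordKey = records.next();
>         // ... parse the control type from the key and update the cleaner's
>         //     transaction metadata (elided) ...
>         return controlRecordKey != null;
>     }
>
>     public static void main(String[] args) {
>         System.out.println(safeControlBatchRead(Collections.emptyIterator())); // false, no crash
>     }
> }
> {code}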
> {noformat}
>   else if (batchRetention == BatchRetention.RETAIN_EMPTY) {
>     if (batchMagic < RecordBatch.MAGIC_VALUE_V2)
>       throw new IllegalStateException("Empty batches are only supported for magic v2 and above");
> 
>     bufferOutputStream.ensureRemaining(DefaultRecordBatch.RECORD_BATCH_OVERHEAD);
>     DefaultRecordBatch.writeEmptyHeader(bufferOutputStream.buffer(), batchMagic, batch.producerId(),
>         batch.producerEpoch(), batch.baseSequence(), batch.baseOffset(), batch.lastOffset(),
>         batch.partitionLeaderEpoch(), batch.timestampType(), batch.maxTimestamp(),
>         batch.isTransactional(), batch.isControlBatch());
>     filterResult.updateRetainedBatchMetadata(batch, 0, true);
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-6990) CommitFailedException; this task may be no longer owned by the thread

2018-10-05 Thread Ryanne Dolan (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryanne Dolan resolved KAFKA-6990.
-
Resolution: Not A Bug

> CommitFailedException; this task may be no longer owned by the thread
> -
>
> Key: KAFKA-6990
> URL: https://issues.apache.org/jira/browse/KAFKA-6990
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.1
>Reporter: Eugen Feller
>Priority: Blocker
>
> We are seeing a lot of CommitFailedExceptions on one of our Kafka stream 
> apps. Running Kafka Streams 0.11.0.1 and Kafka Broker 0.10.2.1. Full error 
> message:
> {code:java}
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] INFO 
> org.apache.kafka.streams.KafkaStreams - stream-client 
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d] State transition from 
> REBALANCING to RUNNING.
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] WARN 
> org.apache.kafka.streams.processor.internals.StreamTask - task [0_0] Failed 
> offset commits 
> {sightings_sighting_byclientmac_0-0=OffsetAndMetadata{offset=11, 
> metadata=''}, 
> mactocontactmappings_contactmapping_byclientmac_0-0=OffsetAndMetadata{offset=26,
>  metadata=''}} due to CommitFailedException
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] WARN 
> org.apache.kafka.streams.processor.internals.AssignedTasks - stream-thread 
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] Failed to commit 
> stream task 0_0 during commit state due to CommitFailedException; this task 
> may be no longer owned by the thread
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] WARN 
> org.apache.kafka.streams.processor.internals.AssignedTasks - stream-thread 
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] Failed to commit 
> stream task 0_1 during commit state due to CommitFailedException; this task 
> may be no longer owned by the thread
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] WARN 
> org.apache.kafka.streams.processor.internals.StreamTask - task [0_1] Failed 
> offset commits 
> {mactocontactmappings_contactmapping_byclientmac_0-1=OffsetAndMetadata{offset=38,
>  metadata=''}, sightings_sighting_byclientmac_0-1=OffsetAndMetadata{offset=1, 
> metadata=''}} due to CommitFailedException
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] WARN 
> org.apache.kafka.streams.processor.internals.StreamTask - task [0_2] Failed 
> offset commits 
> {mactocontactmappings_contactmapping_byclientmac_0-2=OffsetAndMetadata{offset=8,
>  metadata=''}, 
> sightings_sighting_byclientmac_0-2=OffsetAndMetadata{offset=24, metadata=''}} 
> due to CommitFailedException
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] WARN 
> org.apache.kafka.streams.processor.internals.AssignedTasks - stream-thread 
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] Failed to commit 
> stream task 0_2 during commit state due to CommitFailedException; this task 
> may be no longer owned by the thread
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] WARN 
> org.apache.kafka.streams.processor.internals.StreamTask - task [0_3] Failed 
> offset commits 
> {sightings_sighting_byclientmac_0-3=OffsetAndMetadata{offset=21, 
> metadata=''}, 
> mactocontactmappings_contactmapping_byclientmac_0-3=OffsetAndMetadata{offset=102,
>  metadata=''}} due to CommitFailedException
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] WARN 
> org.apache.kafka.streams.processor.internals.AssignedTasks - stream-thread 
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] Failed to commit 
> stream task 0_3 during commit state due to CommitFailedException; this task 
> may be no longer owned by the thread
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] WARN 
> org.apache.kafka.streams.processor.internals.StreamTask - task [0_4] Failed 
> offset commits 
> {sightings_sighting_byclientmac_0-4=OffsetAndMetadata{offset=5, metadata=''}, 
> mactocontactmappings_contactmapping_byclientmac_0-4=OffsetAndMetadata{offset=20,
>  metadata=''}} due to CommitFailedException
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] WARN 
> org.apache.kafka.streams.processor.internals.AssignedTasks - stream-thread 
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] Failed to commit 
> stream task 0_4 during commit state due to CommitFailedException; this task 
> may be no longer owned by the thread
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] WARN 
> org.apache.kafka.streams.processor.internals.StreamTask - task [0_5] Failed 
> offset commits 
> {mactocontactmappings_contactmapping_byclientmac_0-5=OffsetAndMetadata{offset=26,
>  metadata=''}, 
> 

[jira] [Commented] (KAFKA-6990) CommitFailedException; this task may be no longer owned by the thread

2018-10-05 Thread Ryanne Dolan (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640256#comment-16640256
 ] 

Ryanne Dolan commented on KAFKA-6990:
-

Usually this is due to records being processed slowly and sequentially, which 
seems to be the case here as well. Consider processing records asynchronously / in 
parallel if you can, so that you can get through the batch of records faster. If 
you can't do that, e.g. because order matters, then reduce max.poll.records to 
something small so that poll() returns only a few records for you to process at a 
time.
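For example (a sketch; ConsumerConfig.MAX_POLL_RECORDS_CONFIG and StreamsConfig.consumerPrefix are the standard client configs, the value 50 is only illustrative):

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.streams.StreamsConfig;

public class SmallPollConfigExample {
    public static void main(String[] args) {
        // Plain consumer: poll() hands back at most 50 records at a time, so each
        // batch can be processed and committed well within the poll interval.
        Properties consumerProps = new Properties();
        consumerProps.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 50);

        // Kafka Streams: the same setting, routed to the embedded consumer.
        Properties streamsProps = new Properties();
        streamsProps.put(StreamsConfig.consumerPrefix(ConsumerConfig.MAX_POLL_RECORDS_CONFIG), 50);
    }
}
{code}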

> CommitFailedException; this task may be no longer owned by the thread
> -
>
> Key: KAFKA-6990
> URL: https://issues.apache.org/jira/browse/KAFKA-6990
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.1
>Reporter: Eugen Feller
>Priority: Blocker
>
> We are seeing a lot of CommitFailedExceptions on one of our Kafka stream 
> apps. Running Kafka Streams 0.11.0.1 and Kafka Broker 0.10.2.1. Full error 
> message:
> {code:java}
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] INFO 
> org.apache.kafka.streams.KafkaStreams - stream-client 
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d] State transition from 
> REBALANCING to RUNNING.
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] WARN 
> org.apache.kafka.streams.processor.internals.StreamTask - task [0_0] Failed 
> offset commits 
> {sightings_sighting_byclientmac_0-0=OffsetAndMetadata{offset=11, 
> metadata=''}, 
> mactocontactmappings_contactmapping_byclientmac_0-0=OffsetAndMetadata{offset=26,
>  metadata=''}} due to CommitFailedException
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] WARN 
> org.apache.kafka.streams.processor.internals.AssignedTasks - stream-thread 
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] Failed to commit 
> stream task 0_0 during commit state due to CommitFailedException; this task 
> may be no longer owned by the thread
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] WARN 
> org.apache.kafka.streams.processor.internals.AssignedTasks - stream-thread 
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] Failed to commit 
> stream task 0_1 during commit state due to CommitFailedException; this task 
> may be no longer owned by the thread
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] WARN 
> org.apache.kafka.streams.processor.internals.StreamTask - task [0_1] Failed 
> offset commits 
> {mactocontactmappings_contactmapping_byclientmac_0-1=OffsetAndMetadata{offset=38,
>  metadata=''}, sightings_sighting_byclientmac_0-1=OffsetAndMetadata{offset=1, 
> metadata=''}} due to CommitFailedException
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] WARN 
> org.apache.kafka.streams.processor.internals.StreamTask - task [0_2] Failed 
> offset commits 
> {mactocontactmappings_contactmapping_byclientmac_0-2=OffsetAndMetadata{offset=8,
>  metadata=''}, 
> sightings_sighting_byclientmac_0-2=OffsetAndMetadata{offset=24, metadata=''}} 
> due to CommitFailedException
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] WARN 
> org.apache.kafka.streams.processor.internals.AssignedTasks - stream-thread 
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] Failed to commit 
> stream task 0_2 during commit state due to CommitFailedException; this task 
> may be no longer owned by the thread
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] WARN 
> org.apache.kafka.streams.processor.internals.StreamTask - task [0_3] Failed 
> offset commits 
> {sightings_sighting_byclientmac_0-3=OffsetAndMetadata{offset=21, 
> metadata=''}, 
> mactocontactmappings_contactmapping_byclientmac_0-3=OffsetAndMetadata{offset=102,
>  metadata=''}} due to CommitFailedException
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] WARN 
> org.apache.kafka.streams.processor.internals.AssignedTasks - stream-thread 
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] Failed to commit 
> stream task 0_3 during commit state due to CommitFailedException; this task 
> may be no longer owned by the thread
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] WARN 
> org.apache.kafka.streams.processor.internals.StreamTask - task [0_4] Failed 
> offset commits 
> {sightings_sighting_byclientmac_0-4=OffsetAndMetadata{offset=5, metadata=''}, 
> mactocontactmappings_contactmapping_byclientmac_0-4=OffsetAndMetadata{offset=20,
>  metadata=''}} due to CommitFailedException
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] WARN 
> org.apache.kafka.streams.processor.internals.AssignedTasks - stream-thread 
> [visits-bdb209f1-c9dd-404b-9233-064f2fb4227d-StreamThread-1] Failed to commit 
> stream task 0_4 

[jira] [Commented] (KAFKA-7467) NoSuchElementException is raised because controlBatch is empty

2018-10-05 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640218#comment-16640218
 ] 

ASF GitHub Bot commented on KAFKA-7467:
---

hachikuji closed pull request #5727: KAFKA-7467: NoSuchElementException is 
raised because controlBatch is …
URL: https://github.com/apache/kafka/pull/5727
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java b/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java
index 2a6ac0cfec7..6b42d07cc3f 100644
--- a/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java
+++ b/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java
@@ -1300,8 +1300,7 @@ private boolean containsAbortMarker(RecordBatch batch) {
 
         Iterator<Record> batchIterator = batch.iterator();
         if (!batchIterator.hasNext())
-            throw new InvalidRecordException("Invalid batch for partition " + partition + " at offset " +
-                    batch.baseOffset() + " with control sequence set, but no records");
+            return false;
 
         Record firstRecord = batchIterator.next();
         return ControlRecordType.ABORT == ControlRecordType.parse(firstRecord.key());
diff --git a/clients/src/test/java/org/apache/kafka/clients/consumer/internals/FetcherTest.java b/clients/src/test/java/org/apache/kafka/clients/consumer/internals/FetcherTest.java
index 75a34ccf908..42f6beb16c2 100644
--- a/clients/src/test/java/org/apache/kafka/clients/consumer/internals/FetcherTest.java
+++ b/clients/src/test/java/org/apache/kafka/clients/consumer/internals/FetcherTest.java
@@ -2639,6 +2639,52 @@ private void verifySessionPartitions() {
         assertEquals(0, future.get());
     }
 
+    @Test
+    public void testEmptyControlBatch() {
+        Fetcher<byte[], byte[]> fetcher = createFetcher(subscriptions, new Metrics(), new ByteArrayDeserializer(),
+                new ByteArrayDeserializer(), Integer.MAX_VALUE, IsolationLevel.READ_COMMITTED);
+        ByteBuffer buffer = ByteBuffer.allocate(1024);
+        int currentOffset = 1;
+
+        // Empty control batch should not cause an exception
+        DefaultRecordBatch.writeEmptyHeader(buffer, RecordBatch.MAGIC_VALUE_V2, 1L,
+                (short) 0, -1, 0, 0,
+                RecordBatch.NO_PARTITION_LEADER_EPOCH, TimestampType.CREATE_TIME, time.milliseconds(),
+                true, true);
+
+        currentOffset += appendTransactionalRecords(buffer, 1L, currentOffset,
+                new SimpleRecord(time.milliseconds(), "key".getBytes(), "value".getBytes()),
+                new SimpleRecord(time.milliseconds(), "key".getBytes(), "value".getBytes()));
+
+        commitTransaction(buffer, 1L, currentOffset);
+        buffer.flip();
+
+        List<FetchResponse.AbortedTransaction> abortedTransactions = new ArrayList<>();
+        MemoryRecords records = MemoryRecords.readableRecords(buffer);
+        subscriptions.assignFromUser(singleton(tp0));
+
+        subscriptions.seek(tp0, 0);
+
+        // normal fetch
+        assertEquals(1, fetcher.sendFetches());
+        assertFalse(fetcher.hasCompletedFetches());
+        client.prepareResponse(new MockClient.RequestMatcher() {
+            @Override
+            public boolean matches(AbstractRequest body) {
+                FetchRequest request = (FetchRequest) body;
+                assertEquals(IsolationLevel.READ_COMMITTED, request.isolationLevel());
+                return true;
+            }
+        }, fullFetchResponseWithAbortedTransactions(records, abortedTransactions, Errors.NONE, 100L, 100L, 0));
+
+        consumerClient.poll(time.timer(0));
+        assertTrue(fetcher.hasCompletedFetches());
+
+        Map<TopicPartition, List<ConsumerRecord<byte[], byte[]>>> fetchedRecords = fetcher.fetchedRecords();
+        assertTrue(fetchedRecords.containsKey(tp0));
+        assertEquals(fetchedRecords.get(tp0).size(), 2);
+    }
+
     private MemoryRecords buildRecords(long baseOffset, int count, long firstMessageId) {
         MemoryRecordsBuilder builder = MemoryRecords.builder(ByteBuffer.allocate(1024), CompressionType.NONE,
                 TimestampType.CREATE_TIME, baseOffset);
         for (int i = 0; i < count; i++)
diff --git a/core/src/main/scala/kafka/coordinator/group/GroupMetadataManager.scala b/core/src/main/scala/kafka/coordinator/group/GroupMetadataManager.scala
index 21f4a3dcad9..31cd3615675 100644
--- a/core/src/main/scala/kafka/coordinator/group/GroupMetadataManager.scala
+++ b/core/src/main/scala/kafka/coordinator/group/GroupMetadataManager.scala
@@ -555,17 +555,20 @@ class GroupMetadataManager(brokerId: Int,
   memRecords.batches.asScala.foreach 

[jira] [Assigned] (KAFKA-7485) Flaky test `DyanamicBrokerReconfigurationTest.testTrustStoreAlter`

2018-10-05 Thread Jason Gustafson (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson reassigned KAFKA-7485:
--

Assignee: Rajini Sivaram  (was: Jason Gustafson)

> Flaky test `DyanamicBrokerReconfigurationTest.testTrustStoreAlter`
> --
>
> Key: KAFKA-7485
> URL: https://issues.apache.org/jira/browse/KAFKA-7485
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Assignee: Rajini Sivaram
>Priority: Major
>
> {code}
> 09:53:53 
> 09:53:53 kafka.server.DynamicBrokerReconfigurationTest > testTrustStoreAlter 
> FAILED
> 09:53:53 org.apache.kafka.common.errors.SslAuthenticationException: SSL 
> handshake failed
> 09:53:53 
> 09:53:53 Caused by:
> 09:53:53 javax.net.ssl.SSLProtocolException: Handshake message 
> sequence violation, 2
> 09:53:53 at 
> java.base/sun.security.ssl.Handshaker.checkThrown(Handshaker.java:1611)
> 09:53:53 at 
> java.base/sun.security.ssl.SSLEngineImpl.checkTaskThrown(SSLEngineImpl.java:497)
> 09:53:53 at 
> java.base/sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:745)
> 09:53:53 at 
> java.base/sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:680)
> 09:53:53 at 
> java.base/javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:626)
> 09:53:53 at 
> org.apache.kafka.common.network.SslTransportLayer.handshakeUnwrap(SslTransportLayer.java:474)
> 09:53:53 at 
> org.apache.kafka.common.network.SslTransportLayer.handshake(SslTransportLayer.java:274)
> 09:53:53 at 
> org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:126)
> 09:53:53 at 
> org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:532)
> 09:53:53 at 
> org.apache.kafka.common.network.Selector.poll(Selector.java:467)
> 09:53:53 at 
> org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:510)
> 09:53:53 at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:265)
> 09:53:53 at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:236)
> 09:53:53 at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:215)
> 09:53:53 at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:231)
> 09:53:53 at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:316)
> 09:53:53 at 
> org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1210)
> 09:53:53 at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1179)
> 09:53:53 at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1119)
> 09:53:53 at 
> kafka.server.DynamicBrokerReconfigurationTest.kafka$server$DynamicBrokerReconfigurationTest$$awaitInitialPositions(DynamicBrokerReconfigurationTest.scala:997)
> 09:53:53 at 
> kafka.server.DynamicBrokerReconfigurationTest$ConsumerBuilder.build(DynamicBrokerReconfigurationTest.scala:1424)
> 09:53:53 at 
> kafka.server.DynamicBrokerReconfigurationTest.verifySslProduceConsume$1(DynamicBrokerReconfigurationTest.scala:286)
> 09:53:53 at 
> kafka.server.DynamicBrokerReconfigurationTest.testTrustStoreAlter(DynamicBrokerReconfigurationTest.scala:311)
> 09:53:53 
> 09:53:53 Caused by:
> 09:53:53 javax.net.ssl.SSLProtocolException: Handshake message 
> sequence violation, 2
> 09:53:53 at 
> java.base/sun.security.ssl.HandshakeStateManager.check(HandshakeStateManager.java:398)
> 09:53:53 at 
> java.base/sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:215)
> 09:53:53 at 
> java.base/sun.security.ssl.Handshaker.processLoop(Handshaker.java:1098)
> 09:53:53 at 
> java.base/sun.security.ssl.Handshaker$1.run(Handshaker.java:1031)
> 09:53:53 at 
> java.base/sun.security.ssl.Handshaker$1.run(Handshaker.java:1028)
> 09:53:53 at 
> java.base/java.security.AccessController.doPrivileged(Native Method)
> 09:53:53 at 
> java.base/sun.security.ssl.Handshaker$DelegatedTask.run(Handshaker.java:1540)
> 09:53:53 at 
> org.apache.kafka.common.network.SslTransportLayer.runDelegatedTasks(SslTransportLayer.java:399)
> 09:53:53 at 
> org.apache.kafka.common.network.SslTransportLayer.handshakeUnwrap(SslTransportLayer.java:479)
> 09:53:53 ... 17 

[jira] [Created] (KAFKA-7486) Flaky test `DeleteTopicTest.testAddPartitionDuringDeleteTopic`

2018-10-05 Thread Jason Gustafson (JIRA)
Jason Gustafson created KAFKA-7486:
--

 Summary: Flaky test 
`DeleteTopicTest.testAddPartitionDuringDeleteTopic`
 Key: KAFKA-7486
 URL: https://issues.apache.org/jira/browse/KAFKA-7486
 Project: Kafka
  Issue Type: Bug
Reporter: Jason Gustafson


Starting to see more of this recently:
{code}
10:06:28 kafka.admin.DeleteTopicTest > testAddPartitionDuringDeleteTopic FAILED
10:06:28 kafka.admin.AdminOperationException: 
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode 
for /brokers/topics/test
10:06:28 at 
kafka.zk.AdminZkClient.writeTopicPartitionAssignment(AdminZkClient.scala:162)
10:06:28 at 
kafka.zk.AdminZkClient.createOrUpdateTopicPartitionAssignmentPathInZK(AdminZkClient.scala:102)
10:06:28 at 
kafka.zk.AdminZkClient.addPartitions(AdminZkClient.scala:229)
10:06:28 at 
kafka.admin.DeleteTopicTest.testAddPartitionDuringDeleteTopic(DeleteTopicTest.scala:266)
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7485) Flaky test `DyanamicBrokerReconfigurationTest.testTrustStoreAlter`

2018-10-05 Thread Jason Gustafson (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640194#comment-16640194
 ] 

Jason Gustafson commented on KAFKA-7485:


Note the leaking consumer issue is fixed by 
https://github.com/apache/kafka/pull/5750.

> Flaky test `DyanamicBrokerReconfigurationTest.testTrustStoreAlter`
> --
>
> Key: KAFKA-7485
> URL: https://issues.apache.org/jira/browse/KAFKA-7485
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Major
>
> {code}
> 09:53:53 
> 09:53:53 kafka.server.DynamicBrokerReconfigurationTest > testTrustStoreAlter 
> FAILED
> 09:53:53 org.apache.kafka.common.errors.SslAuthenticationException: SSL 
> handshake failed
> 09:53:53 
> 09:53:53 Caused by:
> 09:53:53 javax.net.ssl.SSLProtocolException: Handshake message 
> sequence violation, 2
> 09:53:53 at 
> java.base/sun.security.ssl.Handshaker.checkThrown(Handshaker.java:1611)
> 09:53:53 at 
> java.base/sun.security.ssl.SSLEngineImpl.checkTaskThrown(SSLEngineImpl.java:497)
> 09:53:53 at 
> java.base/sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:745)
> 09:53:53 at 
> java.base/sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:680)
> 09:53:53 at 
> java.base/javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:626)
> 09:53:53 at 
> org.apache.kafka.common.network.SslTransportLayer.handshakeUnwrap(SslTransportLayer.java:474)
> 09:53:53 at 
> org.apache.kafka.common.network.SslTransportLayer.handshake(SslTransportLayer.java:274)
> 09:53:53 at 
> org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:126)
> 09:53:53 at 
> org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:532)
> 09:53:53 at 
> org.apache.kafka.common.network.Selector.poll(Selector.java:467)
> 09:53:53 at 
> org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:510)
> 09:53:53 at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:265)
> 09:53:53 at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:236)
> 09:53:53 at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:215)
> 09:53:53 at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:231)
> 09:53:53 at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:316)
> 09:53:53 at 
> org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1210)
> 09:53:53 at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1179)
> 09:53:53 at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1119)
> 09:53:53 at 
> kafka.server.DynamicBrokerReconfigurationTest.kafka$server$DynamicBrokerReconfigurationTest$$awaitInitialPositions(DynamicBrokerReconfigurationTest.scala:997)
> 09:53:53 at 
> kafka.server.DynamicBrokerReconfigurationTest$ConsumerBuilder.build(DynamicBrokerReconfigurationTest.scala:1424)
> 09:53:53 at 
> kafka.server.DynamicBrokerReconfigurationTest.verifySslProduceConsume$1(DynamicBrokerReconfigurationTest.scala:286)
> 09:53:53 at 
> kafka.server.DynamicBrokerReconfigurationTest.testTrustStoreAlter(DynamicBrokerReconfigurationTest.scala:311)
> 09:53:53 
> 09:53:53 Caused by:
> 09:53:53 javax.net.ssl.SSLProtocolException: Handshake message 
> sequence violation, 2
> 09:53:53 at 
> java.base/sun.security.ssl.HandshakeStateManager.check(HandshakeStateManager.java:398)
> 09:53:53 at 
> java.base/sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:215)
> 09:53:53 at 
> java.base/sun.security.ssl.Handshaker.processLoop(Handshaker.java:1098)
> 09:53:53 at 
> java.base/sun.security.ssl.Handshaker$1.run(Handshaker.java:1031)
> 09:53:53 at 
> java.base/sun.security.ssl.Handshaker$1.run(Handshaker.java:1028)
> 09:53:53 at 
> java.base/java.security.AccessController.doPrivileged(Native Method)
> 09:53:53 at 
> java.base/sun.security.ssl.Handshaker$DelegatedTask.run(Handshaker.java:1540)
> 09:53:53 at 
> org.apache.kafka.common.network.SslTransportLayer.runDelegatedTasks(SslTransportLayer.java:399)
> 09:53:53 at 
> 

[jira] [Commented] (KAFKA-7415) OffsetsForLeaderEpoch may incorrectly respond with undefined epoch causing truncation to HW

2018-10-05 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640178#comment-16640178
 ] 

ASF GitHub Bot commented on KAFKA-7415:
---

hachikuji opened a new pull request #5749: KAFKA-7415; Persist leader epoch and 
start offset on becoming a leader
URL: https://github.com/apache/kafka/pull/5749
 
 
   Note this is a backport of #5678 for 1.1
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> OffsetsForLeaderEpoch may incorrectly respond with undefined epoch causing 
> truncation to HW
> ---
>
> Key: KAFKA-7415
> URL: https://issues.apache.org/jira/browse/KAFKA-7415
> Project: Kafka
>  Issue Type: Bug
>  Components: replication
>Affects Versions: 2.0.0
>Reporter: Anna Povzner
>Assignee: Jason Gustafson
>Priority: Major
> Fix For: 2.0.1, 2.1.0
>
>
> If the follower's last appended epoch is ahead of the leader's last appended 
> epoch, the OffsetsForLeaderEpoch response will incorrectly send 
> (UNDEFINED_EPOCH, UNDEFINED_EPOCH_OFFSET), and the follower will truncate to 
> HW. This may lead to data loss in some rare cases where 2 back-to-back leader 
> elections happen (failure of one leader, followed by quick re-election of the 
> next leader due to preferred leader election, so that all replicas are still 
> in the ISR, and then failure of the 3rd leader).
> The bug is in LeaderEpochFileCache.endOffsetFor(), which returns 
> (UNDEFINED_EPOCH, UNDEFINED_EPOCH_OFFSET) if the requested leader epoch is 
> ahead of the last leader epoch in the cache. The method should return (last 
> leader epoch in the cache, LEO) in this scenario.
> We don't create an entry in a leader epoch cache until a message is appended 
> with the new leader epoch. Every append to log calls 
> LeaderEpochFileCache.assign(). However, it would be much cleaner if 
> `makeLeader` created an entry in the cache as soon as replica becomes a 
> leader, which will fix the bug. In case the leader never appends any 
> messages, and the next leader epoch starts with the same offset, we already 
> have clearAndFlushLatest() that clears entries with start offsets greater or 
> equal to the passed offset. LeaderEpochFileCache.assign() could be merged 
> with clearAndFlushLatest(), so that we clear cache entries with offsets equal 
> or greater than the start offset of the new epoch, so that we do not need to 
> call these methods separately. 
>  
> Here is an example of a scenario where the issue leads to the data loss.
> Suppose we have three replicas: r1, r2, and r3. Initially, the ISR consists 
> of (r1, r2, r3) and the leader is r1. The data up to offset 10 has been 
> committed to the ISR. Here is the initial state:
> {code:java}
> Leader: r1
> leader epoch: 0
> ISR(r1, r2, r3)
> r1: [hw=10, leo=10]
> r2: [hw=8, leo=10]
> r3: [hw=5, leo=10]
> {code}
> Replica 1 fails and leaves the ISR, which makes Replica 2 the new leader with 
> leader epoch = 1. The leader appends a batch, but it is not replicated yet to 
> the followers.
> {code:java}
> Leader: r2
> leader epoch: 1
> ISR(r2, r3)
> r1: [hw=10, leo=10]
> r2: [hw=8, leo=11]
> r3: [hw=5, leo=10]
> {code}
> Replica 3 is elected a leader (due to preferred leader election) before it 
> has a chance to truncate, with leader epoch 2. 
> {code:java}
> Leader: r3
> leader epoch: 2
> ISR(r2, r3)
> r1: [hw=10, leo=10]
> r2: [hw=8, leo=11]
> r3: [hw=5, leo=10]
> {code}
> Replica 2 sends OffsetsForLeaderEpoch(leader epoch = 1) to Replica 3. Replica 
> 3 incorrectly replies with UNDEFINED_EPOCH_OFFSET, and Replica 2 truncates to 
> HW. If Replica 3 fails before Replica 2 re-fetches the data, this may lead to 
> data loss.
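
For illustration, a simplified Java sketch of the intended endOffsetFor behaviour 
described above (the real LeaderEpochFileCache is Scala; this map-based stand-in 
only spells out the case discussed here and omits the other lookups):

{code:java}
import java.util.NavigableMap;
import java.util.TreeMap;

// Simplified stand-in for the leader epoch cache: epoch -> start offset of that epoch.
public class EpochCacheSketch {
    static final int UNDEFINED_EPOCH = -1;
    static final long UNDEFINED_EPOCH_OFFSET = -1L;

    private final NavigableMap<Integer, Long> epochToStartOffset = new TreeMap<>();

    // Returns {epoch, endOffset} for the requested leader epoch.
    public long[] endOffsetFor(int requestedEpoch, long logEndOffset) {
        if (epochToStartOffset.isEmpty())
            return new long[]{UNDEFINED_EPOCH, UNDEFINED_EPOCH_OFFSET};
        int latestEpoch = epochToStartOffset.lastKey();
        if (requestedEpoch >= latestEpoch) {
            // The case discussed above: answer (latest cached epoch, LEO) instead of the
            // undefined sentinel, so the follower fetches from LEO rather than truncating to HW.
            return new long[]{latestEpoch, logEndOffset};
        }
        // Lookups for older epochs are omitted in this sketch.
        return new long[]{UNDEFINED_EPOCH, UNDEFINED_EPOCH_OFFSET};
    }
}
{code}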



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7483) Streams should allow headers to be passed to Serializer

2018-10-05 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640165#comment-16640165
 ] 

Matthias J. Sax commented on KAFKA-7483:


I don't understand this ticket.

1) There is no `RecordCollectorImpl#serializer()` method, so what do you want to 
"overload"?

2) Via KIP-244 (as pointed out by John), you can already manipulate the headers: 
note that the input record headers are forwarded to the output record, so you can 
take the input record headers (via `context.headers()`) and update them to contain 
the schema information for the output record.

Thus, I don't think we need to change anything; this is already supported.
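
For illustration, a minimal sketch of that KIP-244 approach; the "schema" header 
key and the value transformation are placeholders, not part of any Kafka API:

{code:java}
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;

// Rewrites the forwarded headers inside a Transformer before the record reaches the sink.
public class SchemaHeaderTransformer implements Transformer<String, String, KeyValue<String, String>> {
    private ProcessorContext context;

    @Override
    public void init(ProcessorContext context) {
        this.context = context;
    }

    @Override
    public KeyValue<String, String> transform(String key, String value) {
        // the input record's headers are forwarded to the output record, so updating
        // them here determines what the sink topic will carry
        context.headers().remove("schema");
        context.headers().add("schema", "y-schema".getBytes());  // placeholder schema info
        return KeyValue.pair(key, value.toUpperCase());          // stand-in for "x transformed to y"
    }

    @Override
    public void close() { }
}
{code}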

> Streams should allow headers to be passed to Serializer
> ---
>
> Key: KAFKA-7483
> URL: https://issues.apache.org/jira/browse/KAFKA-7483
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Kamal Chandraprakash
>Assignee: Kamal Chandraprakash
>Priority: Major
>
> We are storing schema metadata for the record key and value in the headers. The 
> serializer includes this metadata in the record headers. While doing a simple 
> record transformation (x transformed to y) in Streams, the same headers that 
> were passed from the source are pushed to the sink topic. This leads to errors 
> while reading the sink topic.
> We should call the overloaded `serialize(topic, headers, object)` method in 
> [RecordCollectorImpl|https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/RecordCollectorImpl.java#L156],
> which in turn adds the correct metadata to the record headers.
> With this, sink topic readers have the option to read all the values for a 
> header key using `Headers#headers`, or only the overwritten value using 
> `Headers#lastHeader`.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7483) Streams should allow headers to be passed to Serializer

2018-10-05 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640181#comment-16640181
 ] 

Matthias J. Sax commented on KAFKA-7483:


I see. So it's a two line fix? Sure, we can do that. Sorry for the confusion.

> Streams should allow headers to be passed to Serializer
> ---
>
> Key: KAFKA-7483
> URL: https://issues.apache.org/jira/browse/KAFKA-7483
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Kamal Chandraprakash
>Assignee: Kamal Chandraprakash
>Priority: Major
>
> We are storing schema metadata for the record key and value in the headers. The 
> serializer includes this metadata in the record headers. While doing a simple 
> record transformation (x transformed to y) in Streams, the same headers that 
> were passed from the source are pushed to the sink topic. This leads to errors 
> while reading the sink topic.
> We should call the overloaded `serialize(topic, headers, object)` method in 
> [RecordCollectorImpl|https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/RecordCollectorImpl.java#L156],
> which in turn adds the correct metadata to the record headers.
> With this, sink topic readers have the option to read all the values for a 
> header key using `Headers#headers`, or only the overwritten value using 
> `Headers#lastHeader`.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7483) Streams should allow headers to be passed to Serializer

2018-10-05 Thread Guozhang Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640174#comment-16640174
 ] 

Guozhang Wang commented on KAFKA-7483:
--

[~mjsax] I think [~ckamal] was referring to `Serializer#serialize(String topic, 
Headers headers, T data)`, i.e. having RecordCollectorImpl.send() call this 
overload rather than `serialize(String topic, T data)`.
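
Roughly like the following sketch; the class and field names are stand-ins, not 
the actual RecordCollectorImpl internals:

{code:java}
import org.apache.kafka.common.header.Headers;
import org.apache.kafka.common.serialization.Serializer;

// Sketch of the proposed call-site change: pass the forwarded record headers
// through to the serializer instead of dropping them.
class SendSketch<K, V> {
    private final Serializer<K> keySerializer;
    private final Serializer<V> valueSerializer;

    SendSketch(Serializer<K> keySerializer, Serializer<V> valueSerializer) {
        this.keySerializer = keySerializer;
        this.valueSerializer = valueSerializer;
    }

    void send(String topic, Headers headers, K key, V value) {
        // before: keySerializer.serialize(topic, key) / valueSerializer.serialize(topic, value)
        byte[] keyBytes = keySerializer.serialize(topic, headers, key);
        byte[] valueBytes = valueSerializer.serialize(topic, headers, value);
        // ... hand keyBytes/valueBytes to the producer as usual ...
    }
}
{code}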

> Streams should allow headers to be passed to Serializer
> ---
>
> Key: KAFKA-7483
> URL: https://issues.apache.org/jira/browse/KAFKA-7483
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Kamal Chandraprakash
>Assignee: Kamal Chandraprakash
>Priority: Major
>
> We are storing schema metadata for the record key and value in the headers. The 
> serializer includes this metadata in the record headers. While doing a simple 
> record transformation (x transformed to y) in Streams, the same headers that 
> were passed from the source are pushed to the sink topic. This leads to errors 
> while reading the sink topic.
> We should call the overloaded `serialize(topic, headers, object)` method in 
> [RecordCollectorImpl|https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/RecordCollectorImpl.java#L156],
> which in turn adds the correct metadata to the record headers.
> With this, sink topic readers have the option to read all the values for a 
> header key using `Headers#headers`, or only the overwritten value using 
> `Headers#lastHeader`.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7485) Flaky test `DyanamicBrokerReconfigurationTest.testTrustStoreAlter`

2018-10-05 Thread Jason Gustafson (JIRA)
Jason Gustafson created KAFKA-7485:
--

 Summary: Flaky test 
`DyanamicBrokerReconfigurationTest.testTrustStoreAlter`
 Key: KAFKA-7485
 URL: https://issues.apache.org/jira/browse/KAFKA-7485
 Project: Kafka
  Issue Type: Bug
Reporter: Jason Gustafson
Assignee: Jason Gustafson


{code}
09:53:53 
09:53:53 kafka.server.DynamicBrokerReconfigurationTest > testTrustStoreAlter 
FAILED
09:53:53 org.apache.kafka.common.errors.SslAuthenticationException: SSL 
handshake failed
09:53:53 
09:53:53 Caused by:
09:53:53 javax.net.ssl.SSLProtocolException: Handshake message sequence 
violation, 2
09:53:53 at 
java.base/sun.security.ssl.Handshaker.checkThrown(Handshaker.java:1611)
09:53:53 at 
java.base/sun.security.ssl.SSLEngineImpl.checkTaskThrown(SSLEngineImpl.java:497)
09:53:53 at 
java.base/sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:745)
09:53:53 at 
java.base/sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:680)
09:53:53 at 
java.base/javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:626)
09:53:53 at 
org.apache.kafka.common.network.SslTransportLayer.handshakeUnwrap(SslTransportLayer.java:474)
09:53:53 at 
org.apache.kafka.common.network.SslTransportLayer.handshake(SslTransportLayer.java:274)
09:53:53 at 
org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:126)
09:53:53 at 
org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:532)
09:53:53 at 
org.apache.kafka.common.network.Selector.poll(Selector.java:467)
09:53:53 at 
org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:510)
09:53:53 at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:265)
09:53:53 at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:236)
09:53:53 at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:215)
09:53:53 at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:231)
09:53:53 at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:316)
09:53:53 at 
org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1210)
09:53:53 at 
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1179)
09:53:53 at 
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1119)
09:53:53 at 
kafka.server.DynamicBrokerReconfigurationTest.kafka$server$DynamicBrokerReconfigurationTest$$awaitInitialPositions(DynamicBrokerReconfigurationTest.scala:997)
09:53:53 at 
kafka.server.DynamicBrokerReconfigurationTest$ConsumerBuilder.build(DynamicBrokerReconfigurationTest.scala:1424)
09:53:53 at 
kafka.server.DynamicBrokerReconfigurationTest.verifySslProduceConsume$1(DynamicBrokerReconfigurationTest.scala:286)
09:53:53 at 
kafka.server.DynamicBrokerReconfigurationTest.testTrustStoreAlter(DynamicBrokerReconfigurationTest.scala:311)
09:53:53 
09:53:53 Caused by:
09:53:53 javax.net.ssl.SSLProtocolException: Handshake message 
sequence violation, 2
09:53:53 at 
java.base/sun.security.ssl.HandshakeStateManager.check(HandshakeStateManager.java:398)
09:53:53 at 
java.base/sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:215)
09:53:53 at 
java.base/sun.security.ssl.Handshaker.processLoop(Handshaker.java:1098)
09:53:53 at 
java.base/sun.security.ssl.Handshaker$1.run(Handshaker.java:1031)
09:53:53 at 
java.base/sun.security.ssl.Handshaker$1.run(Handshaker.java:1028)
09:53:53 at 
java.base/java.security.AccessController.doPrivileged(Native Method)
09:53:53 at 
java.base/sun.security.ssl.Handshaker$DelegatedTask.run(Handshaker.java:1540)
09:53:53 at 
org.apache.kafka.common.network.SslTransportLayer.runDelegatedTasks(SslTransportLayer.java:399)
09:53:53 at 
org.apache.kafka.common.network.SslTransportLayer.handshakeUnwrap(SslTransportLayer.java:479)
09:53:53 ... 17 more
{code}

Also, it seems we might not be cleaning up the consumer, because I see a bunch 
of subsequent failures due to a lingering consumer heartbeat thread.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7484) Fix test SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown()

2018-10-05 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-7484:
---
Affects Version/s: 2.1.0

> Fix test 
> SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown()
> 
>
> Key: KAFKA-7484
> URL: https://issues.apache.org/jira/browse/KAFKA-7484
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.1.0
>Reporter: Dong Lin
>Assignee: John Roesler
>Priority: Major
>
> The test 
> SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown() fails 
> in the 2.1.0 branch Jenkins job. See 
> [https://builds.apache.org/job/kafka-2.1-jdk8/1/testReport/junit/org.apache.kafka.streams.integration/SuppressionDurabilityIntegrationTest/shouldRecoverBufferAfterShutdown_1__eosEnabled_true_/.|https://builds.apache.org/job/kafka-2.1-jdk8/1/testReport/junit/org.apache.kafka.streams.integration/SuppressionDurabilityIntegrationTest/shouldRecoverBufferAfterShutdown_1__eosEnabled_true_/]
> Here is the stack trace: 
> java.lang.AssertionError: Condition not met within timeout 3. Did not 
> receive all 3 records from topic output-raw-shouldRecoverBufferAfterShutdown 
> at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:278) at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinRecordsReceived(IntegrationTestUtils.java:462)
>  at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinRecordsReceived(IntegrationTestUtils.java:343)
>  at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.verifyKeyValueTimestamps(IntegrationTestUtils.java:543)
>  at 
> org.apache.kafka.streams.integration.SuppressionDurabilityIntegrationTest.verifyOutput(SuppressionDurabilityIntegrationTest.java:239)
>  at 
> org.apache.kafka.streams.integration.SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown(SuppressionDurabilityIntegrationTest.java:206)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>  at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:363) at 
> org.junit.runners.Suite.runChild(Suite.java:128) at 
> org.junit.runners.Suite.runChild(Suite.java:27) at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) at 
> org.junit.rules.RunRules.evaluate(RunRules.java:20) at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:363) at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KAFKA-7484) Fix test SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown()

2018-10-05 Thread John Roesler (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler reassigned KAFKA-7484:
---

Assignee: John Roesler

> Fix test 
> SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown()
> 
>
> Key: KAFKA-7484
> URL: https://issues.apache.org/jira/browse/KAFKA-7484
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, unit tests
>Reporter: Dong Lin
>Assignee: John Roesler
>Priority: Major
>
> The test 
> SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown() fails 
> in the 2.1.0 branch Jenkins job. See 
> [https://builds.apache.org/job/kafka-2.1-jdk8/1/testReport/junit/org.apache.kafka.streams.integration/SuppressionDurabilityIntegrationTest/shouldRecoverBufferAfterShutdown_1__eosEnabled_true_/.|https://builds.apache.org/job/kafka-2.1-jdk8/1/testReport/junit/org.apache.kafka.streams.integration/SuppressionDurabilityIntegrationTest/shouldRecoverBufferAfterShutdown_1__eosEnabled_true_/]
> Here is the stack trace: 
> java.lang.AssertionError: Condition not met within timeout 3. Did not 
> receive all 3 records from topic output-raw-shouldRecoverBufferAfterShutdown 
> at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:278) at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinRecordsReceived(IntegrationTestUtils.java:462)
>  at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinRecordsReceived(IntegrationTestUtils.java:343)
>  at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.verifyKeyValueTimestamps(IntegrationTestUtils.java:543)
>  at 
> org.apache.kafka.streams.integration.SuppressionDurabilityIntegrationTest.verifyOutput(SuppressionDurabilityIntegrationTest.java:239)
>  at 
> org.apache.kafka.streams.integration.SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown(SuppressionDurabilityIntegrationTest.java:206)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>  at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:363) at 
> org.junit.runners.Suite.runChild(Suite.java:128) at 
> org.junit.runners.Suite.runChild(Suite.java:27) at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) at 
> org.junit.rules.RunRules.evaluate(RunRules.java:20) at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:363) at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7484) Fix test SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown()

2018-10-05 Thread John Roesler (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler updated KAFKA-7484:

Component/s: (was: unit tests)
 Issue Type: Bug  (was: Improvement)

> Fix test 
> SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown()
> 
>
> Key: KAFKA-7484
> URL: https://issues.apache.org/jira/browse/KAFKA-7484
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Dong Lin
>Assignee: John Roesler
>Priority: Major
>
> The test 
> SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown() fails 
> in the 2.1.0 branch Jenkins job. See 
> [https://builds.apache.org/job/kafka-2.1-jdk8/1/testReport/junit/org.apache.kafka.streams.integration/SuppressionDurabilityIntegrationTest/shouldRecoverBufferAfterShutdown_1__eosEnabled_true_/.|https://builds.apache.org/job/kafka-2.1-jdk8/1/testReport/junit/org.apache.kafka.streams.integration/SuppressionDurabilityIntegrationTest/shouldRecoverBufferAfterShutdown_1__eosEnabled_true_/]
> Here is the stack trace: 
> java.lang.AssertionError: Condition not met within timeout 3. Did not 
> receive all 3 records from topic output-raw-shouldRecoverBufferAfterShutdown 
> at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:278) at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinRecordsReceived(IntegrationTestUtils.java:462)
>  at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinRecordsReceived(IntegrationTestUtils.java:343)
>  at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.verifyKeyValueTimestamps(IntegrationTestUtils.java:543)
>  at 
> org.apache.kafka.streams.integration.SuppressionDurabilityIntegrationTest.verifyOutput(SuppressionDurabilityIntegrationTest.java:239)
>  at 
> org.apache.kafka.streams.integration.SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown(SuppressionDurabilityIntegrationTest.java:206)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>  at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:363) at 
> org.junit.runners.Suite.runChild(Suite.java:128) at 
> org.junit.runners.Suite.runChild(Suite.java:27) at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) at 
> org.junit.rules.RunRules.evaluate(RunRules.java:20) at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:363) at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-5117) Kafka Connect REST endpoints reveal Password typed values

2018-10-05 Thread Ewen Cheslack-Postava (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-5117.
--
   Resolution: Duplicate
 Assignee: Ewen Cheslack-Postava
Fix Version/s: 2.0.0

> Kafka Connect REST endpoints reveal Password typed values
> -
>
> Key: KAFKA-5117
> URL: https://issues.apache.org/jira/browse/KAFKA-5117
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: Thomas Holmes
>Assignee: Ewen Cheslack-Postava
>Priority: Major
>  Labels: needs-kip
> Fix For: 2.0.0
>
>
> A Kafka Connect connector can specify ConfigDef keys as type of Password. 
> This type was added to prevent logging the values (instead "[hidden]" is 
> logged).
> This change does not apply to the values returned by executing a GET on 
> {{connectors/\{connector-name\}}} and 
> {{connectors/\{connector-name\}/config}}. This creates an easily accessible 
> way for an attacker who has infiltrated your network to gain access to 
> potential secrets that should not be available.
> I have started on a code change that addresses this issue by parsing the 
> config values through the ConfigDef for the connector and returning their 
> output instead (which leads to the masking of Password typed configs as 
> [hidden]).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-5117) Kafka Connect REST endpoints reveal Password typed values

2018-10-05 Thread Ewen Cheslack-Postava (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640116#comment-16640116
 ] 

Ewen Cheslack-Postava commented on KAFKA-5117:
--

Going to close this since 
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-297%3A+Externalizing+Secrets+for+Connect+Configurations]
 addresses this problem. Feel free to reopen if that doesn't sufficiently 
address the issue.

> Kafka Connect REST endpoints reveal Password typed values
> -
>
> Key: KAFKA-5117
> URL: https://issues.apache.org/jira/browse/KAFKA-5117
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: Thomas Holmes
>Priority: Major
>  Labels: needs-kip
>
> A Kafka Connect connector can specify ConfigDef keys as type of Password. 
> This type was added to prevent logging the values (instead "[hidden]" is 
> logged).
> This change does not apply to the values returned by executing a GET on 
> {{connectors/\{connector-name\}}} and 
> {{connectors/\{connector-name\}/config}}. This creates an easily accessible 
> way for an attacker who has infiltrated your network to gain access to 
> potential secrets that should not be available.
> I have started on a code change that addresses this issue by parsing the 
> config values through the ConfigDef for the connector and returning their 
> output instead (which leads to the masking of Password typed configs as 
> [hidden]).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7461) Connect Values converter should have coverage of logical types

2018-10-05 Thread Ewen Cheslack-Postava (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640101#comment-16640101
 ] 

Ewen Cheslack-Postava commented on KAFKA-7461:
--

[~ritabrata1808] I didn't look carefully at which types were missing coverage, 
but yeah, that looks like the right general direction.

> Connect Values converter should have coverage of logical types
> --
>
> Key: KAFKA-7461
> URL: https://issues.apache.org/jira/browse/KAFKA-7461
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 1.1.1, 2.0.0
>Reporter: Ewen Cheslack-Postava
>Assignee: Ewen Cheslack-Postava
>Priority: Blocker
>  Labels: newbie, test
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Per fix from KAFKA-7460, we've got some gaps in testing for the Values 
> converter added in KIP-145, in particular for logical types. It looks like 
> there are a few other gaps (e.g. from a quick scan of coverage, maybe the float 
> types as well), but logical types seem to be the bulk other than trivial 
> wrapper methods.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7484) Fix test SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown()

2018-10-05 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640091#comment-16640091
 ] 

ASF GitHub Bot commented on KAFKA-7484:
---

vvcephei opened a new pull request #5748: KAFKA-7484: fix suppression 
integration tests
URL: https://github.com/apache/kafka/pull/5748
 
 
   The test data is based at time 0 (aka 1970), but the broker was started at 
time `now`.
   This caused the log cleaner to delete the test data if the tests took 
slightly too long, despite its generous default configuration.
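
   For illustration, a hedged sketch of the idea only (not necessarily the exact 
   change in this PR; topic, keys and values are placeholders): base the test 
   records' timestamps on the current wall-clock time so broker retention and 
   cleanup do not see them as ancient and remove them mid-test.

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class RecentlyTimestampedTestData {
    public static void produce(Properties producerProps, String topic) {
        long base = System.currentTimeMillis();
        try (KafkaProducer<String, String> producer =
                     new KafkaProducer<>(producerProps, new StringSerializer(), new StringSerializer())) {
            for (int i = 0; i < 3; i++) {
                // explicit timestamp = base + i, instead of 0 ("1970")
                producer.send(new ProducerRecord<>(topic, null, base + i, "k" + i, "v" + i));
            }
        }
    }
}
{code}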
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Fix test 
> SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown()
> 
>
> Key: KAFKA-7484
> URL: https://issues.apache.org/jira/browse/KAFKA-7484
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, unit tests
>Reporter: Dong Lin
>Priority: Major
>
> The test 
> SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown() fails 
> in the 2.1.0 branch Jenkins job. See 
> [https://builds.apache.org/job/kafka-2.1-jdk8/1/testReport/junit/org.apache.kafka.streams.integration/SuppressionDurabilityIntegrationTest/shouldRecoverBufferAfterShutdown_1__eosEnabled_true_/.|https://builds.apache.org/job/kafka-2.1-jdk8/1/testReport/junit/org.apache.kafka.streams.integration/SuppressionDurabilityIntegrationTest/shouldRecoverBufferAfterShutdown_1__eosEnabled_true_/]
> Here is the stack trace: 
> java.lang.AssertionError: Condition not met within timeout 3. Did not 
> receive all 3 records from topic output-raw-shouldRecoverBufferAfterShutdown 
> at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:278) at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinRecordsReceived(IntegrationTestUtils.java:462)
>  at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinRecordsReceived(IntegrationTestUtils.java:343)
>  at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.verifyKeyValueTimestamps(IntegrationTestUtils.java:543)
>  at 
> org.apache.kafka.streams.integration.SuppressionDurabilityIntegrationTest.verifyOutput(SuppressionDurabilityIntegrationTest.java:239)
>  at 
> org.apache.kafka.streams.integration.SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown(SuppressionDurabilityIntegrationTest.java:206)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>  at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:363) at 
> org.junit.runners.Suite.runChild(Suite.java:128) at 
> org.junit.runners.Suite.runChild(Suite.java:27) at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) at 
> org.junit.rules.RunRules.evaluate(RunRules.java:20) at 
> 

[jira] [Commented] (KAFKA-7266) Fix MetricsTest test flakiness

2018-10-05 Thread Stanislav Kozlovski (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640085#comment-16640085
 ] 

Stanislav Kozlovski commented on KAFKA-7266:


Thanks. Sorry for not marking it the first time around

> Fix MetricsTest test flakiness
> --
>
> Key: KAFKA-7266
> URL: https://issues.apache.org/jira/browse/KAFKA-7266
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Stanislav Kozlovski
>Assignee: Stanislav Kozlovski
>Priority: Minor
> Fix For: 2.1.0, 2.2.0
>
>
> The test `kafka.api.MetricsTest.testMetrics` has been failing intermittently 
> in kafka builds (recent proof: 
> https://github.com/apache/kafka/pull/5436#issuecomment-409683955)
> The particular failure is in the `MessageConversionsTimeMs` metric assertion -
> {code}
> java.lang.AssertionError: Message conversion time not recorded 0.0
> {code}
> There has been work done previously 
> (https://github.com/apache/kafka/pull/4681) to combat the flakiness of the 
> test and while it has improved it, the test still fails sometimes.
> h3. Solution
> On my machine, the test failed 5 times out of 25 runs. Increasing the record 
> size and using compression should slow down message conversion enough to have 
> it be above 1ms. Locally this test has not failed in 200 runs and counting 
> with those changes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7483) Streams should allow headers to be passed to Serializer

2018-10-05 Thread Guozhang Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640073#comment-16640073
 ] 

Guozhang Wang commented on KAFKA-7483:
--

[~ckamal] I think this is a good one to add, and it should be straightforward as 
well. Compatibility-wise I do not see any obvious issues: the default impl of 
`serialize(String topic, Headers headers, T data)` ignores the headers, so users 
who do not serialize the headers anyway should not see any surprises.

Would you like to submit a PR for this, along with some unit tests to make sure 
the logic is added properly?
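
For example (a sketch only, assuming the trunk `Serializer` interface with the 
default three-argument `serialize` overload; the "schema-id" header key and its 
value are placeholders), a serializer that uses the headers-aware overload could 
attach its metadata like this:

{code:java}
import java.nio.charset.StandardCharsets;
import java.util.Map;
import org.apache.kafka.common.header.Headers;
import org.apache.kafka.common.serialization.Serializer;

public class SchemaTaggingStringSerializer implements Serializer<String> {

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) { }

    @Override
    public byte[] serialize(String topic, String data) {
        // two-argument path: no headers available, plain serialization
        return data == null ? null : data.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public byte[] serialize(String topic, Headers headers, String data) {
        headers.remove("schema-id");                                      // drop any stale value from the source record
        headers.add("schema-id", "42".getBytes(StandardCharsets.UTF_8));  // placeholder schema metadata
        return serialize(topic, data);
    }

    @Override
    public void close() { }
}
{code}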

> Streams should allow headers to be passed to Serializer
> ---
>
> Key: KAFKA-7483
> URL: https://issues.apache.org/jira/browse/KAFKA-7483
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Kamal Chandraprakash
>Assignee: Kamal Chandraprakash
>Priority: Major
>
> We are storing schema metadata for the record key and value in the headers. The 
> serializer includes this metadata in the record headers. While doing a simple 
> record transformation (x transformed to y) in Streams, the same headers that 
> were passed from the source are pushed to the sink topic. This leads to errors 
> while reading the sink topic.
> We should call the overloaded `serialize(topic, headers, object)` method in 
> [RecordCollectorImpl|https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/RecordCollectorImpl.java#L156],
> which in turn adds the correct metadata to the record headers.
> With this, sink topic readers have the option to read all the values for a 
> header key using `Headers#headers`, or only the overwritten value using 
> `Headers#lastHeader`.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7266) Fix MetricsTest test flakiness

2018-10-05 Thread Ismael Juma (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640040#comment-16640040
 ] 

Ismael Juma commented on KAFKA-7266:


[~enether], the JIRA was not marked as fixed, that's why. I did it now.

> Fix MetricsTest test flakiness
> --
>
> Key: KAFKA-7266
> URL: https://issues.apache.org/jira/browse/KAFKA-7266
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Stanislav Kozlovski
>Assignee: Stanislav Kozlovski
>Priority: Minor
> Fix For: 2.1.0, 2.2.0
>
>
> The test `kafka.api.MetricsTest.testMetrics` has been failing intermittently 
> in kafka builds (recent proof: 
> https://github.com/apache/kafka/pull/5436#issuecomment-409683955)
> The particular failure is in the `MessageConversionsTimeMs` metric assertion -
> {code}
> java.lang.AssertionError: Message conversion time not recorded 0.0
> {code}
> There has been work done previously 
> (https://github.com/apache/kafka/pull/4681) to combat the flakiness of the 
> test and while it has improved it, the test still fails sometimes.
> h3. Solution
> On my machine, the test failed 5 times out of 25 runs. Increasing the record 
> size and using compression should slow down message conversion enough to have 
> it be above 1ms. Locally this test has not failed in 200 runs and counting 
> with those changes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7266) Fix MetricsTest test flakiness

2018-10-05 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-7266.

   Resolution: Fixed
Fix Version/s: 2.1.0

> Fix MetricsTest test flakiness
> --
>
> Key: KAFKA-7266
> URL: https://issues.apache.org/jira/browse/KAFKA-7266
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Stanislav Kozlovski
>Assignee: Stanislav Kozlovski
>Priority: Minor
> Fix For: 2.1.0, 2.2.0
>
>
> The test `kafka.api.MetricsTest.testMetrics` has been failing intermittently 
> in kafka builds (recent proof: 
> https://github.com/apache/kafka/pull/5436#issuecomment-409683955)
> The particular failure is in the `MessageConversionsTimeMs` metric assertion -
> {code}
> java.lang.AssertionError: Message conversion time not recorded 0.0
> {code}
> There has been work done previously 
> (https://github.com/apache/kafka/pull/4681) to combat the flakiness of the 
> test and while it has improved it, the test still fails sometimes.
> h3. Solution
> On my machine, the test failed 5 times out of 25 runs. Increasing the record 
> size and using compression should slow down message conversion enough to have 
> it be above 1ms. Locally this test has not failed in 200 runs and counting 
> with those changes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7266) Fix MetricsTest test flakiness

2018-10-05 Thread Stanislav Kozlovski (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640035#comment-16640035
 ] 

Stanislav Kozlovski commented on KAFKA-7266:


Hey [~lindong], is there any reason this can't go into `2.1.0`?

> Fix MetricsTest test flakiness
> --
>
> Key: KAFKA-7266
> URL: https://issues.apache.org/jira/browse/KAFKA-7266
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Stanislav Kozlovski
>Assignee: Stanislav Kozlovski
>Priority: Minor
> Fix For: 2.2.0
>
>
> The test `kafka.api.MetricsTest.testMetrics` has been failing intermittently 
> in kafka builds (recent proof: 
> https://github.com/apache/kafka/pull/5436#issuecomment-409683955)
> The particular failure is in the `MessageConversionsTimeMs` metric assertion -
> {code}
> java.lang.AssertionError: Message conversion time not recorded 0.0
> {code}
> There has been work done previously 
> (https://github.com/apache/kafka/pull/4681) to combat the flakiness of the 
> test and while it has improved it, the test still fails sometimes.
> h3. Solution
> On my machine, the test failed 5 times out of 25 runs. Increasing the record 
> size and using compression should slow down message conversion enough to have 
> it be above 1ms. Locally this test has not failed in 200 runs and counting 
> with those changes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KAFKA-7266) Fix MetricsTest test flakiness

2018-10-05 Thread Stanislav Kozlovski (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640035#comment-16640035
 ] 

Stanislav Kozlovski edited comment on KAFKA-7266 at 10/5/18 4:19 PM:
-

Hey [~lindong], is there any reason this can't go into `2.1.0`? It has been in 
`trunk` since Aug 13 and the `2.1.0` branch isn't cut yet, so I assume it will be 
cut from `trunk` and this will actually go into `2.1.0` by default?


was (Author: enether):
Hey [~lindong], is there any reason this can't go into `2.1.0`?

> Fix MetricsTest test flakiness
> --
>
> Key: KAFKA-7266
> URL: https://issues.apache.org/jira/browse/KAFKA-7266
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Stanislav Kozlovski
>Assignee: Stanislav Kozlovski
>Priority: Minor
> Fix For: 2.2.0
>
>
> The test `kafka.api.MetricsTest.testMetrics` has been failing intermittently 
> in kafka builds (recent proof: 
> https://github.com/apache/kafka/pull/5436#issuecomment-409683955)
> The particular failure is in the `MessageConversionsTimeMs` metric assertion -
> {code}
> java.lang.AssertionError: Message conversion time not recorded 0.0
> {code}
> There has been work done previously 
> (https://github.com/apache/kafka/pull/4681) to combat the flakiness of the 
> test and while it has improved it, the test still fails sometimes.
> h3. Solution
> On my machine, the test failed 5 times out of 25 runs. Increasing the record 
> size and using compression should slow down message conversion enough to have 
> it be above 1ms. Locally this test has not failed in 200 runs and counting 
> with those changes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7484) Fix test SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown()

2018-10-05 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-7484:
---
Component/s: unit tests
 streams

> Fix test 
> SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown()
> 
>
> Key: KAFKA-7484
> URL: https://issues.apache.org/jira/browse/KAFKA-7484
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, unit tests
>Reporter: Dong Lin
>Priority: Major
>
> The test 
> SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown() fails 
> in the 2.1.0 branch Jenkins job. See 
> [https://builds.apache.org/job/kafka-2.1-jdk8/1/testReport/junit/org.apache.kafka.streams.integration/SuppressionDurabilityIntegrationTest/shouldRecoverBufferAfterShutdown_1__eosEnabled_true_/.|https://builds.apache.org/job/kafka-2.1-jdk8/1/testReport/junit/org.apache.kafka.streams.integration/SuppressionDurabilityIntegrationTest/shouldRecoverBufferAfterShutdown_1__eosEnabled_true_/]
> Here is the stack trace: 
> java.lang.AssertionError: Condition not met within timeout 3. Did not 
> receive all 3 records from topic output-raw-shouldRecoverBufferAfterShutdown 
> at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:278) at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinRecordsReceived(IntegrationTestUtils.java:462)
>  at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinRecordsReceived(IntegrationTestUtils.java:343)
>  at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.verifyKeyValueTimestamps(IntegrationTestUtils.java:543)
>  at 
> org.apache.kafka.streams.integration.SuppressionDurabilityIntegrationTest.verifyOutput(SuppressionDurabilityIntegrationTest.java:239)
>  at 
> org.apache.kafka.streams.integration.SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown(SuppressionDurabilityIntegrationTest.java:206)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>  at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:363) at 
> org.junit.runners.Suite.runChild(Suite.java:128) at 
> org.junit.runners.Suite.runChild(Suite.java:27) at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) at 
> org.junit.rules.RunRules.evaluate(RunRules.java:20) at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:363) at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7274) Incorrect subject credential used in inter-broker communication

2018-10-05 Thread TAO XIAO (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640008#comment-16640008
 ] 

TAO XIAO commented on KAFKA-7274:
-

[~rsivaram] Can I assume that the static JAAS config only works for multiple SASL 
mechanisms if and only if a username/password pair is required by just one of the 
mechanisms? If that is the case, it would be better to document this limitation 
to make things clear.

> Incorrect subject credential used in inter-broker communication
> ---
>
> Key: KAFKA-7274
> URL: https://issues.apache.org/jira/browse/KAFKA-7274
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Affects Versions: 1.0.0, 1.0.1, 1.0.2, 1.1.0, 1.1.1, 2.0.0
>Reporter: TAO XIAO
>Priority: Major
>
> We configured one broker setup to enable multiple SASL mechanisms using JAAS 
> config file but we failed to start up the broker.
>  
> Here is security section of server.properties
>  
> {{listeners=SASL_PLAINTEXT://:9092
> security.inter.broker.protocol=SASL_PLAINTEXT
> sasl.enabled.mechanisms=PLAIN,SCRAM-SHA-256
> sasl.mechanism.inter.broker.protocol=PLAIN}}
>  
> JAAS file
>  
> {noformat}
> sasl_plaintext.KafkaServer {
>   org.apache.kafka.common.security.plain.PlainLoginModule required
>   username="admin"
>   password="admin-secret"
>   user_admin="admin-secret"
>   user_alice="alice-secret";
>   org.apache.kafka.common.security.scram.ScramLoginModule required
>   username="admin1"
>   password="admin-secret";
> };{noformat}
>  
> Exception we got
>  
> {noformat}
> [2018-08-10 12:12:13,070] ERROR [Controller id=0, targetBrokerId=0] 
> Connection to node 0 failed authentication due to: Authentication failed: 
> Invalid username or password 
> (org.apache.kafka.clients.NetworkClient){noformat}
>  
> If we changed to use broker configuration property we can start broker 
> successfully
>  
> {noformat}
> listeners=SASL_PLAINTEXT://:9092
> security.inter.broker.protocol=SASL_PLAINTEXT
> sasl.enabled.mechanisms=PLAIN,SCRAM-SHA-256
> sasl.mechanism.inter.broker.protocol=PLAIN
> listener.name.sasl_plaintext.plain.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule
>  required username="admin" password="admin-secret" user_admin="admin-secret" 
> user_alice="alice-secret";
> listener.name.sasl_plaintext.scram-sha-256.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule
>  required username="admin1" password="admin-secret";{noformat}
>  
> I believe this issue is caused by Kafka assigning all login modules to each 
> defined mechanism when using a JAAS file, which results in the Login class 
> adding the usernames defined in both login modules to the same Subject
> [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/security/JaasContext.java#L101]
>  
> [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/security/authenticator/LoginManager.java#L63]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7483) Streams should allow headers to be passed to Serializer

2018-10-05 Thread John Roesler (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639930#comment-16639930
 ] 

John Roesler commented on KAFKA-7483:
-

Hi [~ckamal],

Thanks for the feature request!

This was the recent change (2.0) that allowed record headers to flow through 
the topology:
 * 
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-244%3A+Add+Record+Header+support+to+Kafka+Streams+Processor+API]
 
 * aka KAFKA-6850

It seems like this request is a straightforward extension of that work.

It also doesn't seem like this would affect any public APIs, thus it would not 
require a KIP.

Offhand, it does seem like a simple, beneficial change... As you noted, we'd 
just call the extended serializer method in RecordCollectorImpl.

It would probably be good to search the Streams codebase for other invocations 
of `serialize` just to make sure the feature is airtight.

If you want to, you're welcome to send a PR with the proposed change!

Thanks,

-John
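
A rough, hedged sketch of the call-site change being discussed (not the actual Streams patch): the idea is simply to prefer the header-aware overload wherever the two-argument `serialize` is currently used; method and variable names below are assumptions.

{code}
import org.apache.kafka.common.header.Headers;
import org.apache.kafka.common.serialization.Serializer;

// Sketch of the proposed change (names assumed; not the actual Streams code):
// pass the record's headers through to the serializers instead of dropping them.
final class HeaderAwareSend {
    static <K, V> void send(String topic, Headers headers, K key, V value,
                            Serializer<K> keySerializer, Serializer<V> valueSerializer) {
        byte[] keyBytes = keySerializer.serialize(topic, headers, key);       // was: serialize(topic, key)
        byte[] valueBytes = valueSerializer.serialize(topic, headers, value); // was: serialize(topic, value)
        // ... hand keyBytes/valueBytes plus headers to the underlying producer as before
    }
}
{code}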

> Streams should allow headers to be passed to Serializer
> ---
>
> Key: KAFKA-7483
> URL: https://issues.apache.org/jira/browse/KAFKA-7483
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Kamal Chandraprakash
>Assignee: Kamal Chandraprakash
>Priority: Major
>
> We are storing schema metadata for the record key and value in the headers; the 
> serializer includes this metadata in the record header. While doing a simple 
> record transformation (x transformed to y) in Streams, the same headers that 
> were passed from the source are pushed to the sink topic. This leads to errors 
> while reading the sink topic.
> We should call the overloaded `serialize(topic, headers, object)` method in 
> [RecordCollectorImpl|https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/RecordCollectorImpl.java#L156]
>  which in turn adds the correct metadata to the record header.
> With this, sink topic readers have the option to read all the values for a 
> header key using `Headers#headers`  [or] only the overwritten value using 
> `Headers#lastHeader`.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7477) Improve Streams close timeout semantics

2018-10-05 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639765#comment-16639765
 ] 

ASF GitHub Bot commented on KAFKA-7477:
---

nizhikov opened a new pull request #5747: KAFKA-7477
URL: https://github.com/apache/kafka/pull/5747
 
 
   Second part of 
[KIP-358](https://cwiki.apache.org/confluence/display/KAFKA/KIP-358%3A+Migrate+Streams+API+to+Duration+instead+of+long+ms+times).
   
   This changes based on [previous PR 
discussion](https://github.com/apache/kafka/pull/5682#discussion_r221473451)
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Improve Streams close timeout semantics
> ---
>
> Key: KAFKA-7477
> URL: https://issues.apache.org/jira/browse/KAFKA-7477
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: John Roesler
>Assignee: Nikolay Izhikov
>Priority: Minor
>  Labels: kip, newbie
>
> See [https://github.com/apache/kafka/pull/5682#discussion_r221473451]
> The current timeout semantics are a little "magical":
>  * 0 means to block forever
>  * negative numbers cause the close to complete immediately without checking 
> the state
> I think this would make more sense:
>  * reject negative numbers
>  * make 0 just signal and return immediately (after checking the state once)
>  * if I want to wait "forever", I can use {{ofYears(1)}} or 
> {{ofMillis(Long.MAX_VALUE)}} or some other intuitively "long enough to be 
> forever" value instead of a magic value.
>  
> Part of 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-358%3A+Migrate+Streams+API+to+Duration+instead+of+long+ms+times
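
To make the proposed semantics concrete, a hedged usage sketch assuming the Duration-based `KafkaStreams#close` overload introduced by KIP-358, where the boolean return value signals whether the instance shut down within the timeout; topic names and timeouts are arbitrary.

{code}
import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class CloseTimeoutExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "close-timeout-example");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input-topic").to("output-topic");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();

        // Proposed semantics: negative timeouts are rejected, Duration.ZERO checks the
        // state once and returns immediately, and "forever" is spelled out explicitly
        // rather than expressed through a magic zero value.
        boolean stopped = streams.close(Duration.ofSeconds(30));
        if (!stopped) {
            streams.close(Duration.ofMillis(Long.MAX_VALUE)); // "long enough to be forever"
        }
    }
}
{code}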



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KAFKA-3861) Shrunk ISR before leader crash makes the partition unavailable

2018-10-05 Thread Nico Meyer (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639746#comment-16639746
 ] 

Nico Meyer edited comment on KAFKA-3861 at 10/5/18 12:16 PM:
-

We ran into this same issue this weekend. It was triggered by a faulty disk, 
which had to be replaced. I don't see a way to recover from this properly. 
Enabling unclean leader election would help against downtime, but could lead to 
data loss if any of the remaining followers is behind. I think the scheme 
described by [~maysamyabandeh] is a good idea for unclean leader elections in 
general, at least as an option. At the very least there should be a tool for 
admins to manually choose the best replica to use as the new leader.


was (Author: nico.meyer):
We ran into this same issue this weekend. It was triggered by a faulty disk, 
which had to be replaced. I don't see a way to recover from this properly. 
Enabling unclean leader election would help against downtime, but could lead to 
data loss if any of the remaining followers is behind. I think the scheme 
describe by [~maysamyabandeh] is a good idea for unclean leader elections in 
general, at least as an option.

> Shrunk ISR before leader crash makes the partition unavailable
> --
>
> Key: KAFKA-3861
> URL: https://issues.apache.org/jira/browse/KAFKA-3861
> Project: Kafka
>  Issue Type: Bug
>  Components: replication
>Affects Versions: 0.10.0.0
>Reporter: Maysam Yabandeh
>Priority: Major
>
> We observed a case that the leader experienced a crash and lost its in-memory 
> data and latest HW offsets. Normally Kafka should be safe and be able to make 
> progress with a single node failure. However a few seconds before the crash 
> the leader shrunk its ISR to itself, which is safe since min-in-sync-replicas 
> is 2 and replication factor is 3 thus the troubled leader cannot accept new 
> produce messages. After the crash however the controller could not name any 
> of the followers as the new leader since as far as the controller 
> knows they are not in ISR and could potentially be behind the last leader. 
> Note that unclean-leader-election is disabled in this cluster since the 
> cluster requires a very high degree of durability and cannot tolerate data 
> loss.
> The impact could get worse if the admin brings up the crashed broker in an 
> attempt to make such partitions available again; this would take down even 
> more brokers as the followers panic when they find their offset larger than 
> HW offset in the leader:
> {code}
> if (leaderEndOffset < replica.logEndOffset.messageOffset) {
>   // Prior to truncating the follower's log, ensure that doing so is not 
> disallowed by the configuration for unclean leader election.
>   // This situation could only happen if the unclean election 
> configuration for a topic changes while a replica is down. Otherwise,
>   // we should never encounter this situation since a non-ISR leader 
> cannot be elected if disallowed by the broker configuration.
>   if (!LogConfig.fromProps(brokerConfig.originals, 
> AdminUtils.fetchEntityConfig(replicaMgr.zkUtils,
> ConfigType.Topic, 
> topicAndPartition.topic)).uncleanLeaderElectionEnable) {
> // Log a fatal error and shutdown the broker to ensure that data loss 
> does not unexpectedly occur.
> fatal("Halting because log truncation is not allowed for topic 
> %s,".format(topicAndPartition.topic) +
>   " Current leader %d's latest offset %d is less than replica %d's 
> latest offset %d"
>   .format(sourceBroker.id, leaderEndOffset, brokerConfig.brokerId, 
> replica.logEndOffset.messageOffset))
> Runtime.getRuntime.halt(1)
>   }
> {code}
> One hackish solution would be for the admin to investigate the logs, determine 
> that unclean leader election in this particular case would be safe, temporarily 
> enable it (while the crashed node is down) until new leaders are selected for 
> the affected partitions, wait for the topics' LEO to advance far enough, and 
> then bring up the crashed node again. This manual process is however slow and 
> error-prone, and the cluster will suffer partial unavailability in the 
> meanwhile.
> We are thinking of having the controller make an exception for this case: if 
> the ISR size is less than min-in-sync-replicas and the new leader would be -1, 
> then the controller does an RPC to all the replicas and inquires about the latest 
> offset, and if all the replicas respond then chooses the one with the largest 
> offset as the leader as well as the new ISR. Note that the controller cannot 
> do that if any of the non-leader replicas do not respond since there might be 
> a case that the responding replicas have not been involved in the last ISR and 
> 

[jira] [Commented] (KAFKA-3861) Shrunk ISR before leader crash makes the partition unavailable

2018-10-05 Thread Nico Meyer (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639746#comment-16639746
 ] 

Nico Meyer commented on KAFKA-3861:
---

We ran into this same issue this weekend. It was triggered by a faulty disk, 
which had to be replaced. I don't see a way to recover from this properly. 
Enabling unclean leader election would help against downtime, but could lead to 
data loss if any of the remaining followers is behind. I think the scheme 
described by [~maysamyabandeh] is a good idea for unclean leader elections in 
general, at least as an option.

> Shrunk ISR before leader crash makes the partition unavailable
> --
>
> Key: KAFKA-3861
> URL: https://issues.apache.org/jira/browse/KAFKA-3861
> Project: Kafka
>  Issue Type: Bug
>  Components: replication
>Affects Versions: 0.10.0.0
>Reporter: Maysam Yabandeh
>Priority: Major
>
> We observed a case that the leader experienced a crash and lost its in-memory 
> data and latest HW offsets. Normally Kafka should be safe and be able to make 
> progress with a single node failure. However a few seconds before the crash 
> the leader shrunk its ISR to itself, which is safe since min-in-sync-replicas 
> is 2 and replication factor is 3 thus the troubled leader cannot accept new 
> produce messages. After the crash however the controller could not name any 
> of the followers as the new leader since as far as the controller 
> knows they are not in ISR and could potentially be behind the last leader. 
> Note that unclean-leader-election is disabled in this cluster since the 
> cluster requires a very high degree of durability and cannot tolerate data 
> loss.
> The impact could get worse if the admin brings up the crashed broker in an 
> attempt to make such partitions available again; this would take down even 
> more brokers as the followers panic when they find their offset larger than 
> HW offset in the leader:
> {code}
> if (leaderEndOffset < replica.logEndOffset.messageOffset) {
>   // Prior to truncating the follower's log, ensure that doing so is not 
> disallowed by the configuration for unclean leader election.
>   // This situation could only happen if the unclean election 
> configuration for a topic changes while a replica is down. Otherwise,
>   // we should never encounter this situation since a non-ISR leader 
> cannot be elected if disallowed by the broker configuration.
>   if (!LogConfig.fromProps(brokerConfig.originals, 
> AdminUtils.fetchEntityConfig(replicaMgr.zkUtils,
> ConfigType.Topic, 
> topicAndPartition.topic)).uncleanLeaderElectionEnable) {
> // Log a fatal error and shutdown the broker to ensure that data loss 
> does not unexpectedly occur.
> fatal("Halting because log truncation is not allowed for topic 
> %s,".format(topicAndPartition.topic) +
>   " Current leader %d's latest offset %d is less than replica %d's 
> latest offset %d"
>   .format(sourceBroker.id, leaderEndOffset, brokerConfig.brokerId, 
> replica.logEndOffset.messageOffset))
> Runtime.getRuntime.halt(1)
>   }
> {code}
> One hackish solution would be for the admin to investigate the logs, determine 
> that unclean leader election in this particular case would be safe, temporarily 
> enable it (while the crashed node is down) until new leaders are selected for 
> the affected partitions, wait for the topics' LEO to advance far enough, and 
> then bring up the crashed node again. This manual process is however slow and 
> error-prone, and the cluster will suffer partial unavailability in the 
> meanwhile.
> We are thinking of having the controller make an exception for this case: if 
> the ISR size is less than min-in-sync-replicas and the new leader would be -1, 
> then the controller does an RPC to all the replicas and inquires about the latest 
> offset, and if all the replicas respond then chooses the one with the largest 
> offset as the leader as well as the new ISR. Note that the controller cannot 
> do that if any of the non-leader replicas do not respond, since there might be 
> a case that the responding replicas have not been involved in the last ISR and 
> hence are potentially behind the others (and the controller could not know that 
> since it does not keep track of the previous ISR).
> Pros would be that Kafka will be safely available when such cases occur and 
> would not require any admin intervention. The con however is that the 
> controller talking to brokers inside the leader election function would break 
> the existing pattern in the source code, as currently the leader is elected 
> locally without requiring any additional RPC.
> Thoughts?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7026) Sticky assignor could assign a partition to multiple consumers (KIP-341)

2018-10-05 Thread Narayan Periwal (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639684#comment-16639684
 ] 

Narayan Periwal commented on KAFKA-7026:


Sure [~steven.aerts]. I have already created a ticket related to it - 
https://issues.apache.org/jira/browse/KAFKA-6681

> Sticky assignor could assign a partition to multiple consumers (KIP-341)
> 
>
> Key: KAFKA-7026
> URL: https://issues.apache.org/jira/browse/KAFKA-7026
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Reporter: Vahid Hashemian
>Assignee: Vahid Hashemian
>Priority: Major
>  Labels: kip
> Fix For: 2.2.0
>
>
> In the following scenario sticky assignor assigns a topic partition to two 
> consumers in the group:
>  # Create a topic {{test}} with a single partition
>  # Start consumer {{c1}} in group {{sticky-group}} ({{c1}} becomes group 
> leader and gets {{test-0}})
>  # Start consumer {{c2}}  in group {{sticky-group}} ({{c1}} holds onto 
> {{test-0}}, {{c2}} does not get any partition) 
>  # Pause {{c1}} (e.g. using Java debugger) ({{c2}} becomes leader and takes 
> over {{test-0}}, {{c1}} leaves the group)
>  # Resume {{c1}}
> At this point both {{c1}} and {{c2}} will have {{test-0}} assigned to them.
>  
> The reason is {{c1}} still has kept its previous assignment ({{test-0}}) from 
> the last assignment it received from the leader (itself) and did not get the 
> next round of assignments (when {{c2}} became leader) because it was paused. 
> Both {{c1}} and {{c2}} enter the rebalance supplying {{test-0}} as their 
> existing assignment. The sticky assignor code does not currently check and 
> avoid this duplication.
>  
> Note: This issue was originally reported on 
> [StackOverflow|https://stackoverflow.com/questions/50761842/kafka-stickyassignor-breaking-delivery-to-single-consumer-in-the-group].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7026) Sticky assignor could assign a partition to multiple consumers (KIP-341)

2018-10-05 Thread Steven Aerts (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639667#comment-16639667
 ] 

Steven Aerts commented on KAFKA-7026:
-

[~nperiwal] I propose that you create a separate ticket, as this issue is only 
related to the sticky assignor.

> Sticky assignor could assign a partition to multiple consumers (KIP-341)
> 
>
> Key: KAFKA-7026
> URL: https://issues.apache.org/jira/browse/KAFKA-7026
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Reporter: Vahid Hashemian
>Assignee: Vahid Hashemian
>Priority: Major
>  Labels: kip
> Fix For: 2.2.0
>
>
> In the following scenario sticky assignor assigns a topic partition to two 
> consumers in the group:
>  # Create a topic {{test}} with a single partition
>  # Start consumer {{c1}} in group {{sticky-group}} ({{c1}} becomes group 
> leader and gets {{test-0}})
>  # Start consumer {{c2}}  in group {{sticky-group}} ({{c1}} holds onto 
> {{test-0}}, {{c2}} does not get any partition) 
>  # Pause {{c1}} (e.g. using Java debugger) ({{c2}} becomes leader and takes 
> over {{test-0}}, {{c1}} leaves the group)
>  # Resume {{c1}}
> At this point both {{c1}} and {{c2}} will have {{test-0}} assigned to them.
>  
> The reason is {{c1}} still has kept its previous assignment ({{test-0}}) from 
> the last assignment it received from the leader (itself) and did not get the 
> next round of assignments (when {{c2}} became leader) because it was paused. 
> Both {{c1}} and {{c2}} enter the rebalance supplying {{test-0}} as their 
> existing assignment. The sticky assignor code does not currently check and 
> avoid this duplication.
>  
> Note: This issue was originally reported on 
> [StackOverflow|https://stackoverflow.com/questions/50761842/kafka-stickyassignor-breaking-delivery-to-single-consumer-in-the-group].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7461) Connect Values converter should have coverage of logical types

2018-10-05 Thread Ritabrata Moitra (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639519#comment-16639519
 ] 

Ritabrata Moitra commented on KAFKA-7461:
-

[~ewencp] 

> Connect Values converter should have coverage of logical types
> --
>
> Key: KAFKA-7461
> URL: https://issues.apache.org/jira/browse/KAFKA-7461
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 1.1.1, 2.0.0
>Reporter: Ewen Cheslack-Postava
>Assignee: Ewen Cheslack-Postava
>Priority: Blocker
>  Labels: newbie, test
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Per fix from KAFKA-7460, we've got some gaps in testing for the Values 
> converter added in KIP-145, in particular for logical types. It looks like 
> there are a few other gaps (e.g. from quick scan of coverage, maybe the float 
> types as well), but logical types seem to be the bulk other than trivial 
> wrapper methods.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7461) Connect Values converter should have coverage of logical types

2018-10-05 Thread Ritabrata Moitra (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639517#comment-16639517
 ] 

Ritabrata Moitra commented on KAFKA-7461:
-

{code}
@Test
public void shouldConvertBoolean() {
    assertRoundTrip(Schema.BOOLEAN_SCHEMA, Schema.BOOLEAN_SCHEMA, true);
}

@Test
public void shouldConvertFromStringToBoolean() {
    assertRoundTrip(Schema.BOOLEAN_SCHEMA, Schema.STRING_SCHEMA, true);
}

@Test
public void shouldConvertFromNumberToBoolean() {
    assertRoundTrip(Schema.BOOLEAN_SCHEMA, Schema.INT64_SCHEMA, true);
}
{code}

Can someone help me understand if I am on the right track? Thanks! 
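
To complement the boolean cases above, a hedged sketch of what a logical-type round trip might look like, assuming the same assertRoundTrip helper and Connect's Decimal and Date logical types; the values are arbitrary:

{code}
// Sketch only (same style as the tests above): assumes the existing assertRoundTrip
// helper and that org.apache.kafka.connect.data.{Decimal, Date} and
// java.math.BigDecimal are imported.
@Test
public void shouldConvertDecimal() {
    assertRoundTrip(Decimal.schema(2), Decimal.schema(2), new BigDecimal("3.14"));
}

@Test
public void shouldConvertDate() {
    assertRoundTrip(Date.SCHEMA, Date.SCHEMA, new java.util.Date(0L));
}
{code}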

> Connect Values converter should have coverage of logical types
> --
>
> Key: KAFKA-7461
> URL: https://issues.apache.org/jira/browse/KAFKA-7461
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 1.1.1, 2.0.0
>Reporter: Ewen Cheslack-Postava
>Assignee: Ewen Cheslack-Postava
>Priority: Blocker
>  Labels: newbie, test
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Per fix from KAFKA-7460, we've got some gaps in testing for the Values 
> converter added in KIP-145, in particular for logical types. It looks like 
> there are a few other gaps (e.g. from quick scan of coverage, maybe the float 
> types as well), but logical types seem to be the bulk other than trivial 
> wrapper methods.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7274) Incorrect subject credential used in inter-broker communication

2018-10-05 Thread Rajini Sivaram (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639497#comment-16639497
 ] 

Rajini Sivaram commented on KAFKA-7274:
---

[~xiaotao183] For PLAIN and SCRAM, username/password are the credentials used for 
establishing client connections. On the broker side we need these options only if 
we are using the mechanism for inter-broker communication. SCRAM doesn't have any 
server-side options; the default PLAIN implementation has `user_xxx` options on 
the server side providing the full set of credentials.

To use SCRAM as inter-broker mechanism, the configs would be:
{code}
listeners=SASL_PLAINTEXT://:9092
security.inter.broker.protocol=SASL_PLAINTEXT
sasl.enabled.mechanisms=PLAIN,SCRAM-SHA-256
sasl.mechanism.inter.broker.protocol=SCRAM-SHA-256
{code}

with the JAAS config:
{code}
sasl_plaintext.KafkaServer {
  org.apache.kafka.common.security.plain.PlainLoginModule required
  user_admin="admin-secret"
  user_alice="alice-secret";

  org.apache.kafka.common.security.scram.ScramLoginModule required
  username="admin1"
  password="admin-secret";
};
{code}

> Incorrect subject credential used in inter-broker communication
> ---
>
> Key: KAFKA-7274
> URL: https://issues.apache.org/jira/browse/KAFKA-7274
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Affects Versions: 1.0.0, 1.0.1, 1.0.2, 1.1.0, 1.1.1, 2.0.0
>Reporter: TAO XIAO
>Priority: Major
>
> We configured one broker setup to enable multiple SASL mechanisms using JAAS 
> config file but we failed to start up the broker.
>  
> Here is security section of server.properties
>  
> {{listeners=SASL_PLAINTEXT://:9092
> security.inter.broker.protocol=SASL_PLAINTEXT
> sasl.enabled.mechanisms=PLAIN,SCRAM-SHA-256
> sasl.mechanism.inter.broker.protocol=PLAIN}}
>  
> JAAS file
>  
> {noformat}
> sasl_plaintext.KafkaServer {
>   org.apache.kafka.common.security.plain.PlainLoginModule required
>   username="admin"
>   password="admin-secret"
>   user_admin="admin-secret"
>   user_alice="alice-secret";
>   org.apache.kafka.common.security.scram.ScramLoginModule required
>   username="admin1"
>   password="admin-secret";
> };{noformat}
>  
> Exception we got
>  
> {noformat}
> [2018-08-10 12:12:13,070] ERROR [Controller id=0, targetBrokerId=0] 
> Connection to node 0 failed authentication due to: Authentication failed: 
> Invalid username or password 
> (org.apache.kafka.clients.NetworkClient){noformat}
>  
> If we changed to use broker configuration property we can start broker 
> successfully
>  
> {noformat}
> listeners=SASL_PLAINTEXT://:9092
> security.inter.broker.protocol=SASL_PLAINTEXT
> sasl.enabled.mechanisms=PLAIN,SCRAM-SHA-256
> sasl.mechanism.inter.broker.protocol=PLAIN
> listener.name.sasl_plaintext.plain.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule
>  required username="admin" password="admin-secret" user_admin="admin-secret" 
> user_alice="alice-secret";
> listener.name.sasl_plaintext.scram-sha-256.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule
>  required username="admin1" password="admin-secret";{noformat}
>  
> I believe this issue is caused by Kafka assigning all login modules to each 
> defined mechanism when using a JAAS file, which results in the Login class 
> adding the usernames defined in both login modules to the same Subject
> [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/security/JaasContext.java#L101]
>  
> [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/security/authenticator/LoginManager.java#L63]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7461) Connect Values converter should have coverage of logical types

2018-10-05 Thread Ritabrata Moitra (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639498#comment-16639498
 ] 

Ritabrata Moitra commented on KAFKA-7461:
-

I would like to take a stab at this. 

It seems that coverage is missing for the following types (from a quick scan of 
the coverage):
 # Boolean
 # Byte
 # Short ( Only tested via map )
 # Integer
 # Float
 # Double
 # Struct
 # Time
 # Date
 # Timestamp
 # Decimal 



Is there anything that I might have missed? Also, is there a priority order 
for these? 

> Connect Values converter should have coverage of logical types
> --
>
> Key: KAFKA-7461
> URL: https://issues.apache.org/jira/browse/KAFKA-7461
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 1.1.1, 2.0.0
>Reporter: Ewen Cheslack-Postava
>Assignee: Ewen Cheslack-Postava
>Priority: Blocker
>  Labels: newbie, test
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Per fix from KAFKA-7460, we've got some gaps in testing for the Values 
> converter added in KIP-145, in particular for logical types. It looks like 
> there are a few other gaps (e.g. from quick scan of coverage, maybe the float 
> types as well), but logical types seem to be the bulk other than trivial 
> wrapper methods.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7484) Fix test SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown()

2018-10-05 Thread Dong Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639350#comment-16639350
 ] 

Dong Lin commented on KAFKA-7484:
-

The test is newly added in [https://github.com/apache/kafka/pull/5724]

[~vvcephei] would you have time to take a look? Thanks!

> Fix test 
> SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown()
> 
>
> Key: KAFKA-7484
> URL: https://issues.apache.org/jira/browse/KAFKA-7484
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Dong Lin
>Priority: Major
>
> The test 
> SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown() fails 
> in the 2.1.0 branch Jenkins job. See 
> [https://builds.apache.org/job/kafka-2.1-jdk8/1/testReport/junit/org.apache.kafka.streams.integration/SuppressionDurabilityIntegrationTest/shouldRecoverBufferAfterShutdown_1__eosEnabled_true_/.|https://builds.apache.org/job/kafka-2.1-jdk8/1/testReport/junit/org.apache.kafka.streams.integration/SuppressionDurabilityIntegrationTest/shouldRecoverBufferAfterShutdown_1__eosEnabled_true_/]
> Here is the stack trace: 
> java.lang.AssertionError: Condition not met within timeout 3. Did not 
> receive all 3 records from topic output-raw-shouldRecoverBufferAfterShutdown 
> at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:278) at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinRecordsReceived(IntegrationTestUtils.java:462)
>  at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinRecordsReceived(IntegrationTestUtils.java:343)
>  at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.verifyKeyValueTimestamps(IntegrationTestUtils.java:543)
>  at 
> org.apache.kafka.streams.integration.SuppressionDurabilityIntegrationTest.verifyOutput(SuppressionDurabilityIntegrationTest.java:239)
>  at 
> org.apache.kafka.streams.integration.SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown(SuppressionDurabilityIntegrationTest.java:206)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>  at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:363) at 
> org.junit.runners.Suite.runChild(Suite.java:128) at 
> org.junit.runners.Suite.runChild(Suite.java:27) at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) at 
> org.junit.rules.RunRules.evaluate(RunRules.java:20) at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:363) at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7484) Fix test SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown()

2018-10-05 Thread Dong Lin (JIRA)
Dong Lin created KAFKA-7484:
---

 Summary: Fix test 
SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown()
 Key: KAFKA-7484
 URL: https://issues.apache.org/jira/browse/KAFKA-7484
 Project: Kafka
  Issue Type: Improvement
Reporter: Dong Lin


The test 
SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown() fails 
in the 2.1.0 branch Jenkins job. See 
[https://builds.apache.org/job/kafka-2.1-jdk8/1/testReport/junit/org.apache.kafka.streams.integration/SuppressionDurabilityIntegrationTest/shouldRecoverBufferAfterShutdown_1__eosEnabled_true_/.|https://builds.apache.org/job/kafka-2.1-jdk8/1/testReport/junit/org.apache.kafka.streams.integration/SuppressionDurabilityIntegrationTest/shouldRecoverBufferAfterShutdown_1__eosEnabled_true_/]

Here is the stack trace: 

java.lang.AssertionError: Condition not met within timeout 3. Did not 
receive all 3 records from topic output-raw-shouldRecoverBufferAfterShutdown at 
org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:278) at 
org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinRecordsReceived(IntegrationTestUtils.java:462)
 at 
org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinRecordsReceived(IntegrationTestUtils.java:343)
 at 
org.apache.kafka.streams.integration.utils.IntegrationTestUtils.verifyKeyValueTimestamps(IntegrationTestUtils.java:543)
 at 
org.apache.kafka.streams.integration.SuppressionDurabilityIntegrationTest.verifyOutput(SuppressionDurabilityIntegrationTest.java:239)
 at 
org.apache.kafka.streams.integration.SuppressionDurabilityIntegrationTest.shouldRecoverBufferAfterShutdown(SuppressionDurabilityIntegrationTest.java:206)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498) at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at 
org.junit.runners.ParentRunner.run(ParentRunner.java:363) at 
org.junit.runners.Suite.runChild(Suite.java:128) at 
org.junit.runners.Suite.runChild(Suite.java:27) at 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at 
org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) at 
org.junit.rules.RunRules.evaluate(RunRules.java:20) at 
org.junit.runners.ParentRunner.run(ParentRunner.java:363) at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)