[jira] [Updated] (KAFKA-4422) Drop support for Scala 2.10 (KIP-119)

2017-05-02 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-4422:
---
Summary: Drop support for Scala 2.10 (KIP-119)  (was: Drop support for 
Scala 2.10)

> Drop support for Scala 2.10 (KIP-119)
> -
>
> Key: KAFKA-4422
> URL: https://issues.apache.org/jira/browse/KAFKA-4422
> Project: Kafka
>  Issue Type: Task
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>  Labels: kip
> Fix For: 0.11.0.0
>
>
> Now that Scala 2.12 has been released, we should drop support for Scala 2.10  
> in the next major Kafka version so that we keep the number of supported 
> versions at 2. Since we have to compile and run the tests on each supported 
> version, there is a non-trivial cost from a development and testing 
> perspective.
> The clients library is in Java and we recommend people use the Java clients 
> instead of the Scala ones, so dropping support for Scala 2.10 should have a 
> smaller impact than it would have had in the past. Scala 2.10 was released in 
> January 2013 and support ended in March 2015. 
> Once we drop support for Scala 2.10, we can take advantage of APIs and 
> compiler improvements introduced in Scala 2.11 (released in April 2014): 
> http://scala-lang.org/news/2.11.0
> Link to KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-119%3A+Drop+Support+for+Scala+2.10+in+Kafka+0.11



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5154) Kafka Streams throws NPE during rebalance

2017-05-02 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993933#comment-15993933
 ] 

Matthias J. Sax commented on KAFKA-5154:


Thanks for the details. We will investigate. It would help us if you could 
apply Guozhang's patch and build the Streams library yourself, so that next time 
this happens we get some more information. Would this work for you?

> Kafka Streams throws NPE during rebalance
> -
>
> Key: KAFKA-5154
> URL: https://issues.apache.org/jira/browse/KAFKA-5154
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>
> Please see the attached log. Kafka Streams throws a NullPointerException during 
> rebalance, which is caught by our custom exception handler.
> {noformat}
> 2017-04-30T17:44:17,675 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T17:44:27,395 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T17:44:27,941 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-27, 
> poseidonIncidentFeed-29, poseidonIncidentFeed-30, poseidonIncidentFeed-18] 
> for group hades
> 2017-04-30T17:44:27,947 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:48,468 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:53,628 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:09,587 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:11,961 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @375 - Successfully joined group hades with generation 99
> 2017-04-30T17:45:13,126 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete()
>  @252 - Setting newly assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T17:46:37,254 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T18:04:25,993 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T18:04:29,401 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T18:05:10,877 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-05-01T00:01:55,707 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-05-01T00:01:59,027 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-05-01T00:01:59,031 ERROR StreamThread-1 
> org.apache.kafka.streams.processor.internals.StreamThread.run() @376 - 
> stream-thread [StreamThread-1] Streams application error during processing:
>  java.lang.NullPointerException
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:619)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
>  [kafka-streams-0.10.2.0.jar!/:?]
> 

[jira] [Comment Edited] (KAFKA-5154) Kafka Streams throws NPE during rebalance

2017-05-02 Thread Lukas Gemela (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993902#comment-15993902
 ] 

Lukas Gemela edited comment on KAFKA-5154 at 5/2/17 10:56 PM:
--

Hi,

We have a Kafka Streams app consuming from one topic, aggregating data on 
another internal topic (using an in-memory logged store), and then outputting 
messages to an output topic.
The input topic has, I believe, 40 partitions, and therefore the internal topic 
has 40 partitions as well.
We are running 8 parallel nodes of this app all the time.
We are using the low-level Kafka Streams API.
None of our code is blocking or asynchronous.

I'm not really sure what caused the rebalance, as all logs from all nodes are 
lost now. I've just seen a couple of IllegalStateExceptions on other nodes, 
coming from the Kafka Streams stack, but I'm not even sure whether they were 
thrown around the same time the NPE happened on this node.


___
Probably not important but might be related to the issue:
Please also note that in our exception handler we try to stop the Kafka Streams app 
with our custom timeout and schedule a Spring task to start it again. As far as I 
can tell, calling 

boolean isClosed = streams.close(10, TimeUnit.SECONDS)

never actually succeeds, and we always get isClosed = false.
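
Roughly, the pattern looks like this (a simplified sketch only; the builder, config,
and logger names are placeholders, not our actual code):

import java.util.concurrent.TimeUnit;
import org.apache.kafka.streams.KafkaStreams;

final KafkaStreams streams = new KafkaStreams(builder, streamsConfig);
streams.setUncaughtExceptionHandler((thread, throwable) -> {
    // bounded close: returns false if the stream threads did not stop within the timeout
    boolean isClosed = streams.close(10, TimeUnit.SECONDS);
    if (!isClosed) {
        log.warn("Streams instance did not shut down within 10s; scheduling restart anyway");
    }
    // schedule a task (e.g. a Spring scheduled task) to create and start a fresh instance
});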

Hope it helps!

Lukas


was (Author: lukas gemela):
Hi,

we have a kafka streams app consuming from one topic, aggregating data on 
another internal topic (by using in-memory logged store) and then outputting 
them to out topic.  
Input topic has I believe 40 partitions and therefore internal topic has 40 
partitions as well.
We are running 8 parallel nodes of this app all the time.
We are using low level kafka streams API.
None of our code is blocking nor asynchronous.

I'm not really sure what caused the rebalance as all logs from all nodes are 
lost now, I've just seen couple of illegalStateExceptions on another nodes, 
coming from kafka streams stack, but I'm not even sure if they were thrown 
around the same time as NPE happened on this node.


___
Probably not important but might be related to the issue:
Please also note that in our exception handler we try to stop kafka streams app 
with our custom timeout and schedule a spring task to start it again. As far I 
can tell calling 

boolean isClosed = streams.close(10, TimeUnit.SECONDS)

never actually succeed and we always get isClosed = false.

Hope it helps!

Lukas

> Kafka Streams throws NPE during rebalance
> -
>
> Key: KAFKA-5154
> URL: https://issues.apache.org/jira/browse/KAFKA-5154
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>
> Please see the attached log. Kafka Streams throws a NullPointerException during 
> rebalance, which is caught by our custom exception handler.
> {noformat}
> 2017-04-30T17:44:17,675 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T17:44:27,395 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T17:44:27,941 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-27, 
> poseidonIncidentFeed-29, poseidonIncidentFeed-30, poseidonIncidentFeed-18] 
> for group hades
> 2017-04-30T17:44:27,947 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:48,468 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:53,628 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:09,587 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:11,961 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @375 - Successfully joined group hades with generation 99
> 2017-04-30T17:45:13,126 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete()
>  @252 - Setting newly assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> 

[GitHub] kafka pull request #2926: KAFKA-5131: WriteTxnMarkers and complete commit/ab...

2017-05-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2926


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5131) WriteTxnMarkers and complete commit/abort on partition immigration

2017-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994087#comment-15994087
 ] 

ASF GitHub Bot commented on KAFKA-5131:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2926


> WriteTxnMarkers and complete commit/abort on partition immigration
> --
>
> Key: KAFKA-5131
> URL: https://issues.apache.org/jira/browse/KAFKA-5131
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Reporter: Damian Guy
>Assignee: Damian Guy
>
> When partitions immigrate we need to write the txn markers and complete the 
> commit/abort for any transactions in a PrepareXX state



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5131) WriteTxnMarkers and complete commit/abort on partition immigration

2017-05-02 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5131:
---
   Resolution: Fixed
Fix Version/s: 0.11.0.0
   Status: Resolved  (was: Patch Available)

> WriteTxnMarkers and complete commit/abort on partition immigration
> --
>
> Key: KAFKA-5131
> URL: https://issues.apache.org/jira/browse/KAFKA-5131
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Reporter: Damian Guy
>Assignee: Damian Guy
> Fix For: 0.11.0.0
>
>
> When partitions immigrate we need to write the txn markers and complete the 
> commit/abort for any transactions in a PrepareXX state



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (KAFKA-5154) Kafka Streams throws NPE during rebalance

2017-05-02 Thread Lukas Gemela (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993902#comment-15993902
 ] 

Lukas Gemela edited comment on KAFKA-5154 at 5/2/17 11:10 PM:
--

Hi,

We have a Kafka Streams app consuming from one topic, aggregating data on 
another internal topic (using an in-memory logged store), and then outputting 
messages to an output topic.
The input topic has, I believe, 40 partitions, and therefore the internal topic 
has 40 partitions as well.
We are running 8 parallel nodes of this app all the time.
We are using the low-level Kafka Streams API.
None of our code is blocking or asynchronous.

I'm not really sure what caused the rebalance, as all logs from all nodes are 
lost now. I've just seen a couple of IllegalStateExceptions on other nodes, 
coming from the Kafka Streams stack, but I'm not even sure whether they were 
thrown around the same time the NPE happened on this node.


___
Probably not important but might be related to the issue:
Please also note that in our exception handler we try to stop the Kafka Streams app 
with our custom timeout and schedule a Spring task to start it again. As far as I 
can tell, calling 

boolean isClosed = streams.close(10, TimeUnit.SECONDS)

never actually succeeds, and we always get isClosed = false.

Hope it helps!

Lukas


was (Author: lukas gemela):
Hi,

we have a kafka streams app consuming from one topic, aggregating data on 
another internal topic (by using in-memory logged store) and then outputting 
messages to out topic.  
Input topic has I believe 40 partitions and therefore internal topic has 40 
partitions as well.
We are running 8 parallel nodes of this app all the time.
We are using low level kafka streams API.
None of our code is blocking nor asynchronous.

I'm not really sure what caused the rebalance as all logs from all nodes are 
lost now, I've just seen couple of illegalStateExceptions on another nodes, 
coming from kafka streams stack, but I'm not even sure if they were thrown 
around the same time as NPE happened on this node.


___
Probably not important but might be related to the issue:
Please also note that in our exception handler we try to stop kafka streams app 
with our custom timeout and schedule a spring task to start it again. As far I 
can tell calling 

boolean isClosed = streams.close(10, TimeUnit.SECONDS)

never actually succeed and we always get isClosed = false.

Hope it helps!

Lukas

> Kafka Streams throws NPE during rebalance
> -
>
> Key: KAFKA-5154
> URL: https://issues.apache.org/jira/browse/KAFKA-5154
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>
> Please see the attached log. Kafka Streams throws a NullPointerException during 
> rebalance, which is caught by our custom exception handler.
> {noformat}
> 2017-04-30T17:44:17,675 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T17:44:27,395 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T17:44:27,941 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-27, 
> poseidonIncidentFeed-29, poseidonIncidentFeed-30, poseidonIncidentFeed-18] 
> for group hades
> 2017-04-30T17:44:27,947 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:48,468 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:53,628 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:09,587 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:11,961 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @375 - Successfully joined group hades with generation 99
> 2017-04-30T17:45:13,126 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete()
>  @252 - Setting newly assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> 

[jira] [Updated] (KAFKA-4955) Add network handler thread utilization to request quota calculation

2017-05-02 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-4955:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

[~junrao] Yes, thank you. 

This has been completed in KAFKA-4954.

> Add network handler thread utilization to request quota calculation
> ---
>
> Key: KAFKA-4955
> URL: https://issues.apache.org/jira/browse/KAFKA-4955
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.11.0.0
>
>
> Add network thread utilization time to the metrics used for throttling based 
> on CPU utilization for requests. This time will be taken into account for the 
> throttling decision made when request handler thread time is recorded for the 
> subsequent request on the connection.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-1313) Support adding replicas to existing topic partitions via kafka-topics tool without manually setting broker assignments

2017-05-02 Thread Steven Schlansker (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993414#comment-15993414
 ] 

Steven Schlansker commented on KAFKA-1313:
--

I started using Kafka Streams recently.  I did not know to configure the 
{{replication.factor}} tunable, and so now all of my automatically generated 
topics have the wrong replication factor.  I tried to update via 
{{kafka-topics.sh}} and obviously ended up here.  I understand why this got 
deprioritized, but consider that, in addition to an administrator creating 
topics (where they have an opportunity to set the replication factor correctly), Kafka 
Streams creates topics behind your back, and you may not realize your 
replication factor is wrong until you have a lot of existing data.

I can obviously fix it up by hand as outlined above, but this is a pretty big 
wart and seems well worth fixing.
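
For anyone hitting the same thing, the Streams setting in question is just a config
property; a minimal sketch (application id and bootstrap servers are made up) that
makes Streams create its internal topics with replication factor 3:

import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
// internal topics (changelogs, repartition topics) auto-created by Streams use this value
props.put(StreamsConfig.REPLICATION_FACTOR_CONFIG, 3);

Note this only affects topics created after the change, which is exactly why being
unable to raise the factor on existing topics hurts.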

> Support adding replicas to existing topic partitions via kafka-topics tool 
> without manually setting broker assignments
> --
>
> Key: KAFKA-1313
> URL: https://issues.apache.org/jira/browse/KAFKA-1313
> Project: Kafka
>  Issue Type: New Feature
>  Components: tools
>Affects Versions: 0.8.0
>Reporter: Marc Labbe
>Assignee: Sreepathi Prasanna
>  Labels: newbie++
>
> There is currently no easy way to add replicas to an existing topic 
> partitions.
> For example, topic create-test has been created with ReplicationFactor=1: 
> Topic:create-test  PartitionCount:3  ReplicationFactor:1  Configs:
> Topic: create-test  Partition: 0  Leader: 1  Replicas: 1  Isr: 1
> Topic: create-test  Partition: 1  Leader: 2  Replicas: 2  Isr: 2
> Topic: create-test  Partition: 2  Leader: 3  Replicas: 3  Isr: 3
> I would like to increase the ReplicationFactor to 2 (or more) so it shows up 
> like this instead:
> Topic:create-test  PartitionCount:3  ReplicationFactor:2  Configs:
> Topic: create-test  Partition: 0  Leader: 1  Replicas: 1,2  Isr: 1,2
> Topic: create-test  Partition: 1  Leader: 2  Replicas: 2,3  Isr: 2,3
> Topic: create-test  Partition: 2  Leader: 3  Replicas: 3,1  Isr: 3,1
> Use cases for this:
> - adding brokers and thus increase fault tolerance
> - fixing human errors for topics created with wrong values



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-1194) The kafka broker cannot delete the old log files after the configured time

2017-05-02 Thread Michael Noll (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993467#comment-15993467
 ] 

Michael Noll commented on KAFKA-1194:
-

[~guozhang]: Do you happen to know the latest status of this ticket (Kafka 
issue on MS Windows)?

> The kafka broker cannot delete the old log files after the configured time
> --
>
> Key: KAFKA-1194
> URL: https://issues.apache.org/jira/browse/KAFKA-1194
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.8.1
> Environment: window
>Reporter: Tao Qin
>Assignee: Jay Kreps
>Priority: Critical
>  Labels: features, patch
> Attachments: KAFKA-1194.patch, kafka-1194-v1.patch, 
> kafka-1194-v2.patch, screenshot-1.png, Untitled.jpg
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> We tested it in a Windows environment and set log.retention.hours to 24 
> hours.
> # The minimum age of a log file to be eligible for deletion
> log.retention.hours=24
> After several days, the kafka broker still cannot delete the old log files, 
> and we get the following exceptions:
> [2013-12-19 01:57:38,528] ERROR Uncaught exception in scheduled task 
> 'kafka-log-retention' (kafka.utils.KafkaScheduler)
> kafka.common.KafkaStorageException: Failed to change the log file suffix from 
>  to .deleted for log segment 1516723
>  at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:249)
>  at kafka.log.Log.kafka$log$Log$$asyncDeleteSegment(Log.scala:638)
>  at kafka.log.Log.kafka$log$Log$$deleteSegment(Log.scala:629)
>  at kafka.log.Log$$anonfun$deleteOldSegments$1.apply(Log.scala:418)
>  at kafka.log.Log$$anonfun$deleteOldSegments$1.apply(Log.scala:418)
>  at 
> scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:59)
>  at scala.collection.immutable.List.foreach(List.scala:76)
>  at kafka.log.Log.deleteOldSegments(Log.scala:418)
>  at 
> kafka.log.LogManager.kafka$log$LogManager$$cleanupExpiredSegments(LogManager.scala:284)
>  at 
> kafka.log.LogManager$$anonfun$cleanupLogs$3.apply(LogManager.scala:316)
>  at 
> kafka.log.LogManager$$anonfun$cleanupLogs$3.apply(LogManager.scala:314)
>  at 
> scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:743)
>  at scala.collection.Iterator$class.foreach(Iterator.scala:772)
>  at 
> scala.collection.JavaConversions$JIteratorWrapper.foreach(JavaConversions.scala:573)
>  at scala.collection.IterableLike$class.foreach(IterableLike.scala:73)
>  at 
> scala.collection.JavaConversions$JListWrapper.foreach(JavaConversions.scala:615)
>  at 
> scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:742)
>  at kafka.log.LogManager.cleanupLogs(LogManager.scala:314)
>  at 
> kafka.log.LogManager$$anonfun$startup$1.apply$mcV$sp(LogManager.scala:143)
>  at kafka.utils.KafkaScheduler$$anon$1.run(KafkaScheduler.scala:100)
>  at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>  at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  at java.lang.Thread.run(Thread.java:724)
> I think this error happens because Kafka tries to rename the log file while it 
> is still open, so we should close the file before renaming it.
> The index file uses a special data structure, the MappedByteBuffer. The Javadoc 
> describes it as:
> A mapped byte buffer and the file mapping that it represents remain valid 
> until the buffer itself is garbage-collected.
> Fortunately, I found a forceUnmap function in the Kafka code, and perhaps it can 
> be used to free the MappedByteBuffer.
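
For illustration, the kind of trick a forceUnmap-style helper relies on is sketched
below. DirectBuffer and Cleaner are internal JDK APIs (pre-Java-9), so treat this as
a sketch of the idea, not supported usage or the actual Kafka code:

import java.nio.MappedByteBuffer;

final class MmapUtil {
    static void forceUnmap(MappedByteBuffer buffer) {
        if (buffer instanceof sun.nio.ch.DirectBuffer) {
            sun.misc.Cleaner cleaner = ((sun.nio.ch.DirectBuffer) buffer).cleaner();
            if (cleaner != null) {
                // releases the mapping eagerly so the underlying file can be
                // renamed or deleted on Windows before GC collects the buffer
                cleaner.clean();
            }
        }
    }
}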



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5161) reassign-partitions to check if broker of ID exists in cluster

2017-05-02 Thread Lawrence Weikum (JIRA)
Lawrence Weikum created KAFKA-5161:
--

 Summary: reassign-partitions to check if broker of ID exists in 
cluster
 Key: KAFKA-5161
 URL: https://issues.apache.org/jira/browse/KAFKA-5161
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 0.10.1.1
 Environment: Debian 8
Reporter: Lawrence Weikum
Priority: Minor


A topic was created with only one replica. We wanted to increase it later to 3 
replicas. A JSON file was created, but the IDs for the brokers were incorrect 
and not part of the system. 

The script, or the brokers receiving the reassignment command, should first check 
whether the new IDs exist in the cluster and then continue, throwing an error 
to the user if any of them does not.


The current effect of assigning partitions to non-existent brokers is a stuck 
reassignment with no way to stop it. 
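
A possible shape for such a pre-check, sketched against the new AdminClient (the
broker address and target IDs are made up; the same check could equally be done
against ZooKeeper on older versions):

import java.util.Properties;
import java.util.Set;
import java.util.stream.Collectors;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.common.Node;

public class ReassignmentPreCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // IDs of brokers that are actually part of the cluster
            Set<Integer> liveIds = admin.describeCluster().nodes().get().stream()
                    .map(Node::id)
                    .collect(Collectors.toSet());
            // broker IDs taken from the reassignment JSON
            for (int target : new int[]{1, 2, 5}) {
                if (!liveIds.contains(target)) {
                    throw new IllegalArgumentException(
                            "Broker id " + target + " does not exist in the cluster");
                }
            }
        }
    }
}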



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-3266) Implement KIP-140 RPCs and APIs for creating, altering, and listing ACLs

2017-05-02 Thread Colin P. McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin P. McCabe updated KAFKA-3266:
---
Summary: Implement KIP-140 RPCs and APIs for creating, altering, and 
listing ACLs  (was: Implement KIP-4 RPCs and APIs for creating, altering, and 
listing ACLs)

> Implement KIP-140 RPCs and APIs for creating, altering, and listing ACLs
> 
>
> Key: KAFKA-3266
> URL: https://issues.apache.org/jira/browse/KAFKA-3266
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Grant Henke
>Assignee: Colin P. McCabe
> Fix For: 0.11.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] KIP-145: Classloading Isolation in Connect

2017-05-02 Thread Konstantine Karantasis
Thank you for your vote Colin!

Unfortunately the discussion thread had to move because of a KIP-number
collision. Here's a link to the active discussion thread:

https://www.mail-archive.com/dev@kafka.apache.org/msg71453.html

Apologies for the inconvenience.

Konstantine

On Mon, May 1, 2017 at 10:20 AM, Colin McCabe  wrote:

> +1 (non-binding)
>
> This would be a nice improvement.
>
> C.
>
> On Fri, Apr 28, 2017, at 11:03, Konstantine Karantasis wrote:
> > Hi everyone,
> >
> > we aim to address dependency conflicts in Kafka Connect soon by applying
> > class loading isolation.
> >
> > Feel free to take a look at KIP-145 here:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 145+Classloading+Isolation+in+Connect
> >
> >
> > which describes minimal required changes in the public interfaces and the
> > general implementation approach.
> >
> > This is a much wanted feature for Kafka Connect. Your feedback is highly
> > appreciated.
> >
> > -Konstantine
>


Re: [DISCUSS] KIP-146: Classloading Isolation in Connect

2017-05-02 Thread Ewen Cheslack-Postava
Thanks for the KIP.

A few responses inline, followed by additional comments.

On Mon, May 1, 2017 at 9:50 PM, Konstantine Karantasis <
konstant...@confluent.io> wrote:

> Gwen, Randall thank you for your very insightful observations. I'm glad you
> find this first draft to be an adequate platform for discussion.
>
> I'll attempt replying to your comments in order.
>
> Gwen, I also debated exactly the same two options: a) interpreting absence
> of module path as a user's intention to turn off isolation and b)
> explicitly using an additional boolean property. A few reasons why I went
> with b) in this first draft are:
> 1) As Randall mentions, to leave the option of using a default value open.
> If not immediately in the first version of isolation, maybe in the future.
> 2) I didn't like the implicit character of the choice of interpreting an
> empty string as a clear intention to turn isolation off by the user. Half
> the time could be just that users forget to set a location, although they'd
> like to use class loading isolation.
> 3) There's a slim possibility that in rare occasions a user might want to
> avoid even the slightest increase in memory consumption due to class
> loading duplication. I admit this should be very rare, but given the other
> concerns and that we would really like to keep the isolation implementation
> simple, the option to turn off this feature by using only one additional
> config property might not seem too excessive. At least at the start of this
> discussion.
> 4) Debugging during development might be simpler in some cases.
> 5) Finally, as you mention, this could allow for smoother upgrades.
>

I'm not sure any of these keep you from removing the extra config. Is there
any reason you couldn't have clean support for relying on the CLASSPATH
while still supporting the classloaders? Then getting people onto the new
classloaders does require documentation for how to install connectors, but
that's pretty minimal. And we don't break existing installations where
people are just adding to the CLASSPATH. It seems like this:

1. Allows you to set a default. Isolation is always enabled, but we won't
include any paths/directories we already use. Setting a default just
requires specifying a new location where we'd hold these directories.
2. It doesn't require the implicit choice -- you actually never turn off
isolation, but still support the regular CLASSPATH with an empty list of
isolated loaders
3. The user can still use CLASSPATH if they want to minimize classloader
overhead
4. Debugging can still use CLASSPATH
5. Upgrades just work.
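
To make the isolation mechanics concrete, here is a bare-bones sketch of the kind of
parent-last (child-first) class loader such a scheme builds on -- illustrative only,
not the proposed Connect implementation:

import java.net.URL;
import java.net.URLClassLoader;

// Classes are looked up in the plugin's own URLs before falling back to the
// application class loader, so plugin dependencies shadow conflicting ones
// on the CLASSPATH.
public class ChildFirstClassLoader extends URLClassLoader {

    public ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                try {
                    c = findClass(name);              // module path first
                } catch (ClassNotFoundException e) {
                    c = super.loadClass(name, false); // then the parent / CLASSPATH
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}

In practice the framework API packages (and java.*) still need to be delegated to the
parent, which is exactly the filtering question discussed below.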


>
> Randall, regarding your comments:
> 1) To keep its focus narrow, this KIP, as well as the first implementation
> of isolation in Connect, assume filesystem based discovery. With careful
> implementation, transitioning to discovery schemes that support broader
> URIs I believe should be easy in the future.
>

Maybe just mention a couple of quick examples in the KIP. When described
inline it might be more obvious that it will extend cleanly.


> 2) The example you give makes a good point. However I'm inclined to say
> that such cases should be addressed more as exceptions rather than as being
> the common case. Therefore, I wouldn't see all dependencies imported by the
> framework as required to be filtered out, because in that case we lose the
> advantage of isolation between the framework and the connectors (and we are
> left only with isolation between connectors).

3) I tried to abstract implementation details in this the KIP, but you are
> right. Even though filtering here is mainly used semantically rather than
> literally, it gives an implementation hint that we could avoid.
>

I think we're missing another option -- don't do filtering in the framework and
require that those dependencies are correctly excluded from the modules. If we want to
be nicer about this, we could also detect maybe 2 or 3 classes while
scanning for Connectors/Converters/Transformations that indicate the
classloader has jars that it shouldn't and warn about it. I can't think of
that many that would be an issue -- basically connect-api, connect-runtime
if they really mess it up, and maybe slf4j.


> 4) In the same spirit as in 3) I believe we should reserve enough
> flexibility to the implementation to discover and load classes, when they
> appear in multiple locations under the general module location.
>
>
And a couple of addition comments:

- module.path should be module.paths if it accepts multiple paths
- I think the description of the module paths is a bit confusing because I
think for simpler configuration we actually want to specify the *parent*
directory of module paths. I definitely prefer this since it's simpler
although I am not certain how it will mix with future extensions to other
URL handlers (where a zip file or http URL wouldn't do the same discovery
within subdirectories)

-Ewen


> Thanks again! Let me know what you think.
> Konstantine
>
>
> On Mon, May 1, 2017 at 

[jira] [Commented] (KAFKA-5154) Kafka Streams throws NPE during rebalance

2017-05-02 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993594#comment-15993594
 ] 

Guozhang Wang commented on KAFKA-5154:
--

I think I saw a similar issue a few days ago and was trying to dig up more info with 
this PR, but I cannot reproduce it since then. Just FYI: 
https://github.com/apache/kafka/pull/2928

> Kafka Streams throws NPE during rebalance
> -
>
> Key: KAFKA-5154
> URL: https://issues.apache.org/jira/browse/KAFKA-5154
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>
> Please see the attached log. Kafka Streams throws a NullPointerException during 
> rebalance, which is caught by our custom exception handler.
> {noformat}
> 2017-04-30T17:44:17,675 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T17:44:27,395 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T17:44:27,941 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-27, 
> poseidonIncidentFeed-29, poseidonIncidentFeed-30, poseidonIncidentFeed-18] 
> for group hades
> 2017-04-30T17:44:27,947 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:48,468 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:53,628 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:09,587 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:11,961 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @375 - Successfully joined group hades with generation 99
> 2017-04-30T17:45:13,126 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete()
>  @252 - Setting newly assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T17:46:37,254 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T18:04:25,993 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T18:04:29,401 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T18:05:10,877 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-05-01T00:01:55,707 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-05-01T00:01:59,027 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-05-01T00:01:59,031 ERROR StreamThread-1 
> org.apache.kafka.streams.processor.internals.StreamThread.run() @376 - 
> stream-thread [StreamThread-1] Streams application error during processing:
>  java.lang.NullPointerException
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:619)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
>  [kafka-streams-0.10.2.0.jar!/:?]
> 2017-05-01T00:02:00,038 INFO  StreamThread-1 
> 

[jira] [Created] (KAFKA-5160) KIP-98 : broker side handling for the TxnOffsetCommitRequest

2017-05-02 Thread Apurva Mehta (JIRA)
Apurva Mehta created KAFKA-5160:
---

 Summary: KIP-98 : broker side handling for the 
TxnOffsetCommitRequest
 Key: KAFKA-5160
 URL: https://issues.apache.org/jira/browse/KAFKA-5160
 Project: Kafka
  Issue Type: Task
Reporter: Apurva Mehta
Assignee: Apurva Mehta






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] KIP-136: Add Listener name and Security Protocol name to SelectorMetrics tags

2017-05-02 Thread Ismael Juma
Edoardo,

Are you planning to do this only in the broker? Listener names only exist
at the broker level. Also, clients only have a single security protocol, so
maybe we should do this for brokers only. In that case, the compatibility
impact is also lower because more users rely on the Yammer metrics instead.

I would suggest you start the voting thread after clarifying if this change
only applies at the broker level.

Ismael

On Tue, Apr 25, 2017 at 1:19 PM, Ismael Juma  wrote:

> Thanks for the KIP. I think it makes sense to have those tags. My only
> question is regarding the compatibility impact. We don't have a good
> compatibility story when it comes to adding tags to existing metrics since
> the JmxReporter adds the tags to the object name.
>
> Ismael
>
> On Thu, Mar 30, 2017 at 4:51 PM, Edoardo Comar  wrote:
>
>> Hi all,
>>
>> We created KIP-136: Add Listener name and Security Protocol name to
>> SelectorMetrics tags
>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-136%
>> 3A+Add+Listener+name+and+Security+Protocol+name+to+SelectorMetrics+tags
>>
>> Please help review the KIP. Your feedback is appreciated!
>>
>> cheers,
>> Edo
>> --
>> Edoardo Comar
>> IBM MessageHub
>> eco...@uk.ibm.com
>> IBM UK Ltd, Hursley Park, SO21 2JN
>>
>> IBM United Kingdom Limited Registered in England and Wales with number
>> 741598 Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6
>> 3AU
>> Unless stated otherwise above:
>> IBM United Kingdom Limited - Registered in England and Wales with number
>> 741598.
>> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>>
>
>


Jenkins build is back to normal : kafka-trunk-jdk7 #2138

2017-05-02 Thread Apache Jenkins Server
See 




[GitHub] kafka pull request #2958: KAFKA-5162: Add a reference to AdminClient to docs...

2017-05-02 Thread cmccabe
GitHub user cmccabe opened a pull request:

https://github.com/apache/kafka/pull/2958

KAFKA-5162: Add a reference to AdminClient to docs/api.html



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cmccabe/kafka KAFKA-5162

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2958.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2958


commit b203338524f09de08458cc4102c7087b445ffd3b
Author: Colin P. Mccabe 
Date:   2017-05-02T19:26:19Z

KAFKA-5162: Add a reference to AdminClient to docs/api.html




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5162) Add a reference to AdminClient to docs/api.html

2017-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993565#comment-15993565
 ] 

ASF GitHub Bot commented on KAFKA-5162:
---

GitHub user cmccabe opened a pull request:

https://github.com/apache/kafka/pull/2958

KAFKA-5162: Add a reference to AdminClient to docs/api.html



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cmccabe/kafka KAFKA-5162

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2958.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2958


commit b203338524f09de08458cc4102c7087b445ffd3b
Author: Colin P. Mccabe 
Date:   2017-05-02T19:26:19Z

KAFKA-5162: Add a reference to AdminClient to docs/api.html




> Add a reference to AdminClient to docs/api.html
> ---
>
> Key: KAFKA-5162
> URL: https://issues.apache.org/jira/browse/KAFKA-5162
> Project: Kafka
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 0.11.0.0
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
> Fix For: 0.11.0.0
>
>
> Add a reference to AdminClient to docs/api.html



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5162) Add a reference to AdminClient to docs/api.html

2017-05-02 Thread Colin P. McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993566#comment-15993566
 ] 

Colin P. McCabe commented on KAFKA-5162:


https://github.com/apache/kafka/pull/2958

> Add a reference to AdminClient to docs/api.html
> ---
>
> Key: KAFKA-5162
> URL: https://issues.apache.org/jira/browse/KAFKA-5162
> Project: Kafka
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 0.11.0.0
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
> Fix For: 0.11.0.0
>
>
> Add a reference to AdminClient to docs/api.html



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Work started] (KAFKA-5162) Add a reference to AdminClient to docs/api.html

2017-05-02 Thread Colin P. McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on KAFKA-5162 started by Colin P. McCabe.
--
> Add a reference to AdminClient to docs/api.html
> ---
>
> Key: KAFKA-5162
> URL: https://issues.apache.org/jira/browse/KAFKA-5162
> Project: Kafka
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 0.11.0.0
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
> Fix For: 0.11.0.0
>
>
> Add a reference to AdminClient to docs/api.html



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5162) Add a reference to AdminClient to docs/api.html

2017-05-02 Thread Colin P. McCabe (JIRA)
Colin P. McCabe created KAFKA-5162:
--

 Summary: Add a reference to AdminClient to docs/api.html
 Key: KAFKA-5162
 URL: https://issues.apache.org/jira/browse/KAFKA-5162
 Project: Kafka
  Issue Type: Sub-task
  Components: documentation
Affects Versions: 0.11.0.0
Reporter: Colin P. McCabe
Assignee: Colin P. McCabe


Add a reference to AdminClient to docs/api.html



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] KIP 145 - Expose Record Headers in Kafka Connect

2017-05-02 Thread Ewen Cheslack-Postava
A couple of thoughts:

First, agreed that we definitely want to expose header functionality. Thank
you Mike for starting the conversation! Even if Connect doesn't do anything
special with it, there's value in being able to access/set headers.

On motivation -- I think there are much broader use cases. When thinking
about exposing headers, I'd actually use Replicator as only a minor
supporting case. The reason is that it is a very uncommon case where there
is zero impedance mismatch between the source and sink of the data since
they are both Kafka. This means you don't need to think much about data
formats/serialization. I think the JMS use case is a better example since
JMS headers and Kafka headers don't quite match up. Here's a quick list of
use cases I can think of off the top of my head:

1. Include headers from other systems that support them: JMS (or really any
MQ), HTTP
2. Other connector-specific headers. For example, from JDBC maybe the table
the data comes from is a header; for a CDC connector you might include the
binlog offset as a header.
3. Interceptor/SMT-style use cases for annotating things like provenance of
data:
3a. Generically w/ user-supplied data like data center, host, app ID, etc.
3b. Kafka Connect framework level info, such as the connector/task
generating the data

On deviation from Connect's model -- to be honest, KIP-82 also deviates 
quite substantially from how Kafka handles data already, so we may struggle 
a bit to rectify the two. (In particular, headers specify some structure
and enforce strings specifically for header keys, but then require you to
do serialization of header values yourself...).

I think the use cases I mentioned above may also need different approaches
to how the data in headers are handled. As Gwen mentions, if we expose the
headers to Connectors, they need to have some idea of the format and the
reason for byte[] values in KIP-82 is to leave that decision up to the
organization using them. But without knowing the format, connectors can't
really do anything with them -- if a source connector assumes a format,
they may generate data incompatible with the format used by the rest of the
organization. On the other hand, I have a feeling most people will just use
string-valued headers, so allowing connectors to embed arbitrarily
complex data may not work out well in practice. Or maybe we leave it
flexible, most people default to using StringConverter for the serializer
and Connectors will end up defaulting to that just for compatibility...
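
For reference, the KIP-82 shape on the client side is string keys with opaque byte[]
values; a rough sketch against the eventual producer API (topic and header names are
made up):

import java.nio.charset.StandardCharsets;
import org.apache.kafka.clients.producer.ProducerRecord;

ProducerRecord<String, String> record = new ProducerRecord<>("events", "key", "value");
// header keys are strings; header values are raw bytes whose format is up to the user
record.headers().add("source-dc", "us-east-1".getBytes(StandardCharsets.UTF_8));
record.headers().add("jms-correlation-id", "abc-123".getBytes(StandardCharsets.UTF_8));

Whatever Connect exposes would have to decide whether it hands connectors these raw
bytes or runs them through a Converter first.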

I'm not sure I have a real proposal yet, but I do think understanding the
impact of using a Converter for headers would be useful, and we might want
to think about how this KIP would fit in with transformations (or if that
is something that can be deferred, handled separately from the existing
transformations, etc).

-Ewen

On Mon, May 1, 2017 at 11:52 AM, Michael Pearce 
wrote:

> Hi Gwen,
>
> The intent here was to allow tools that perform a similar role to mirror
> makers, replicating messages from one cluster to another.  E.g. mirror
> maker should just be taking and transferring the headers as is.
>
> We don't actually use this inside our company, so not exposing this isn't
> an issue for us. Just believe there are companies like confluent who have
> tools like replicator that do.
>
> And as good citizens think we should complete the work and expose the
> headers same as in the record to at least allow them to replicate the
> messages as is. Note Steph seems to want it.
>
> Cheers
> Mike
>
> Sent using OWA for iPhone
> 
> From: Gwen Shapira 
> Sent: Monday, May 1, 2017 2:36:34 PM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP 145 - Expose Record Headers in Kafka Connect
>
> Hi,
>
> I'm excited to see the community expanding Connect in this direction!
> Headers + Transforms == Fun message routing.
>
> I like how clean the proposal is, but I'm concerned that it kinda deviates
> from how Connect handles data elsewhere.
> Unlike Kafka, Connect doesn't look at all data as byte-arrays, we have
> converters that take data in specific formats (JSON, Avro) and turns it
> into Connect data types (defined in the data api). I think it will be more
> consistent for connector developers to also get headers as some kind of
> structured or semi-structured data (and to expand the converters to handle
> header conversions as well).
> This will allow for Connect's separation of concerns - Connector developers
> don't worry about data formats (because they get the internal connect
> objects) and Converters do all the data format work.
>
> Another thing, in my experience, APIs work better if they are put into use
> almost immediately - so difficulties in using the APIs are immediately
> surfaced. Are you planning any connectors that will use this feature (not
> necessarily in Kafka, just in general)? Or perhaps we can think of a way 

Build failed in Jenkins: kafka-trunk-jdk8 #1471

2017-05-02 Thread Apache Jenkins Server
See 


Changes:

[junrao] KAFKA-5107; remove preferred replica election state from

--
[...truncated 842.45 KB...]
kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithCollision PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokerListWithNoTopics STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokerListWithNoTopics PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testAutoCreateTopic STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testGetAllTopicMetadata 
STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testGetAllTopicMetadata 
PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testBasicTopicMetadata 
STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.PrimitiveApiTest > testMultiProduce STARTED

kafka.integration.PrimitiveApiTest > testMultiProduce PASSED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch STARTED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch PASSED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize 
STARTED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize PASSED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests STARTED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests PASSED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch STARTED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch PASSED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression STARTED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression PASSED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic STARTED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic PASSED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest STARTED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack PASSED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopicWithCollision 
STARTED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopicWithCollision 
PASSED

kafka.integration.PlaintextTopicMetadataTest > testAliveBrokerListWithNoTopics 
STARTED

kafka.integration.PlaintextTopicMetadataTest > testAliveBrokerListWithNoTopics 
PASSED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopic STARTED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.PlaintextTopicMetadataTest > testGetAllTopicMetadata STARTED

kafka.integration.PlaintextTopicMetadataTest > testGetAllTopicMetadata PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.PlaintextTopicMetadataTest > testBasicTopicMetadata STARTED

kafka.integration.PlaintextTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.MetricsDuringTopicCreationDeletionTest > 
testMetricsDuringTopicCreateDelete STARTED

kafka.integration.MetricsDuringTopicCreationDeletionTest > 
testMetricsDuringTopicCreateDelete PASSED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow PASSED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooLow 
STARTED


[jira] [Commented] (KAFKA-5154) Kafka Streams throws NPE during rebalance

2017-05-02 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993408#comment-15993408
 ] 

Matthias J. Sax commented on KAFKA-5154:


Thanks for reporting this [~Lukas Gemela]. I just double-checked the code: 
https://github.com/apache/kafka/blob/0.10.2.0/streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamThread.java#L618-L619

A {{NullPointerException}} can only occur if {{task}} is {{null}} (if 
{{records}} were {{null}}, this would have happened in line 617 already). 
Thus, it seems that there is no active task for a record. Not sure how this 
could happen atm. What triggers your rebalance? How many topics do you process 
and what is the number of partitions per topic?
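
To make the failure mode concrete, here is a minimal sketch of the code shape 
described above (hypothetical names, not the actual StreamThread source): records 
polled for a partition are handed to the task that owns that partition, so a 
missing active task surfaces as exactly this NPE.

{noformat}
import java.util.List;
import java.util.Map;

class RunLoopSketch {
    interface TaskStub {
        void addRecords(List<String> records);
    }

    static void dispatch(Map<Integer, TaskStub> activeTaskByPartition,
                         Map<Integer, List<String>> polled) {
        for (Map.Entry<Integer, List<String>> entry : polled.entrySet()) {
            // Look up the active task that owns this partition.
            TaskStub task = activeTaskByPartition.get(entry.getKey());
            // If no task is registered for the partition, task is null and this
            // call throws the reported NullPointerException.
            task.addRecords(entry.getValue());
        }
    }
}
{noformat}

The open question is therefore how a polled partition can end up without an 
active task during a rebalance.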

> Kafka Streams throws NPE during rebalance
> -
>
> Key: KAFKA-5154
> URL: https://issues.apache.org/jira/browse/KAFKA-5154
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>
> please see attached log, Kafka streams throws NullPointerException during 
> rebalance, which is caught by our custom exception handler
> {noformat}
> 2017-04-30T17:44:17,675 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T17:44:27,395 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T17:44:27,941 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-27, 
> poseidonIncidentFeed-29, poseidonIncidentFeed-30, poseidonIncidentFeed-18] 
> for group hades
> 2017-04-30T17:44:27,947 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:48,468 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:53,628 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:09,587 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:11,961 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @375 - Successfully joined group hades with generation 99
> 2017-04-30T17:45:13,126 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete()
>  @252 - Setting newly assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T17:46:37,254 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T18:04:25,993 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T18:04:29,401 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T18:05:10,877 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-05-01T00:01:55,707 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-05-01T00:01:59,027 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-05-01T00:01:59,031 ERROR StreamThread-1 
> org.apache.kafka.streams.processor.internals.StreamThread.run() @376 - 
> stream-thread [StreamThread-1] Streams application 

[jira] [Updated] (KAFKA-5153) KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting

2017-05-02 Thread Arpan (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpan updated KAFKA-5153:
-
Affects Version/s: 0.10.2.0

> KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting
> ---
>
> Key: KAFKA-5153
> URL: https://issues.apache.org/jira/browse/KAFKA-5153
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.0
> Environment: RHEL 6
> Java Version  1.8.0_91-b14
>Reporter: Arpan
>Priority: Critical
> Attachments: server_1_72server.log, server_2_73_server.log, 
> server_3_74Server.log, server.properties, ThreadDump_1493564142.dump, 
> ThreadDump_1493564177.dump, ThreadDump_1493564249.dump
>
>
> Hi Team, 
> I was earlier referring to issue KAFKA-4477 because the problem I am facing 
> is similar. I tried to search for the same reference in the release docs as 
> well but did not find anything in 0.10.1.1 or 0.10.2.0. I am currently using 
> 2.11_0.10.2.0.
> I have a 3-node cluster for Kafka and a ZooKeeper cluster as well on the same 
> set of servers, in cluster mode. We have around 240 GB of data transferred 
> through Kafka every day. What we are observing is the server disconnecting 
> from the cluster and the ISR shrinking, and it starts impacting service.
> I have also observed the file descriptor count increasing a bit; in normal 
> circumstances we have not observed an FD count of more than 500, but when the 
> issue started we were observing it in the range of 650-700 on all 3 servers. 
> Attaching thread dumps of all 3 servers from when we started facing the issue 
> recently.
> The issue vanishes once you bounce the nodes, and the setup does not run for 
> more than 5 days without this issue. Attaching server logs as well.
> Kindly let me know if you need any additional information. Attaching 
> server.properties as well for one of the servers (it's similar on all 3 
> servers).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] KIP 141 - ProducerRecordBuilder Interface

2017-05-02 Thread Ismael Juma
Hi Matthias,

Deprecating widely used APIs is a big deal. Build warnings are a nuisance
and can potentially break the build for those who have a zero-warnings
policy (which is good practice). It creates a bunch of busy work for our
users, and various resources like books, blog posts, etc. become out of date.

This does not mean that we should not do it, but the benefit has to be
worth it and we should not do it lightly.

Ismael

On Sat, Apr 29, 2017 at 6:52 PM, Matthias J. Sax 
wrote:

> I understand that we cannot just break stuff (btw: also not for
> Streams!). But deprecating does not break anything, so I don't think
> it's a big deal to change the API as long as we keep the old API as
> deprecated.
>
>
> -Matthias
>
> On 4/29/17 9:28 AM, Jay Kreps wrote:
> > Hey Matthias,
> >
> > Yeah I agree, I'm not against change as a general thing! I also think if
> > you look back on the last two years, we completely rewrote the producer
> and
> > consumer APIs, reworked the binary protocol many times over, and added
> the
> > connector and stream processing apis, both major new additions. So I
> don't
> > think we're in too much danger of stagnating!
> >
> > My two cents was just around breaking compatibility for trivial changes
> > like constructor => builder. I think this only applies to the producer,
> > consumer, and connect apis which are heavily embedded in hundreds of
> > ecosystem components that depend on them. This is different from direct
> > usage. If we break the streams api it is really no big deal---apps just
> > need to rebuild when they upgrade, not the end of the world at all.
> However
> > because many intermediate things depend on the Kafka producer you can
> cause
> > these weird situations where your app depends on two third party things
> > that use Kafka and each requires different, incompatible versions. We did
> > this a lot in earlier versions of Kafka and it was the cause of much
> angst
> > (and an ingrained general reluctance to upgrade) from our users.
> >
> > I still think we may have to break things, i just don't think we should
> do
> > it for things like builders vs direct constructors which i think are kind
> > of a debatable matter of taste.
> >
> > -Jay
> >
> >
> >
> > On Mon, Apr 24, 2017 at 9:40 AM, Matthias J. Sax 
> > wrote:
> >
> >> Hey Jay,
> >>
> >> I understand your concern, and for sure, we will need to keep the
> >> current constructors deprecated for a long time (ie, many years).
> >>
> >> But if we don't make the move, we will not be able to improve. And I
> >> think warnings about using deprecated APIs are an acceptable price to
> >> pay. And the API improvements will help new people who adopt Kafka to
> >> get started more easily.
> >>
> >> Otherwise Kafka might end up like a lot of other enterprise software, with
> >> lots of old stuff that is kept forever because nobody has the guts to
> >> improve/change it.
> >>
> >> Of course, we can still improve the docs of the deprecated constructors,
> >> too.
> >>
> >> Just my two cents.
> >>
> >>
> >> -Matthias
> >>
> >> On 4/23/17 3:37 PM, Jay Kreps wrote:
> >>> Hey guys,
> >>>
> >>> I definitely think that the constructors could have been better
> designed,
> >>> but I think given that they're in heavy use I don't think this proposal
> >>> will improve things. Deprecating constructors just leaves everyone with
> >>> lots of warnings and crossed out things. We can't actually delete the
> >>> methods because lots of code needs to be usable across multiple Kafka
> >>> versions, right? So we aren't picking between the original approach
> >> (worse)
> >>> and the new approach (better); what we are proposing is a perpetual
> >>> mingling of the original style and the new style with a bunch of
> >> deprecated
> >>> stuff, which I think is worst of all.
> >>>
> >>> I'd vote for just documenting the meaning of null in the ProducerRecord
> >>> constructor.
> >>>
> >>> -Jay
> >>>
> >>> On Wed, Apr 19, 2017 at 3:34 PM, Stephane Maarek <
> >>> steph...@simplemachines.com.au> wrote:
> >>>
>  Hi all,
> 
>  My first KIP, let me know your thoughts!
>  https://cwiki.apache.org/confluence/display/KAFKA/KIP+
>  141+-+ProducerRecordBuilder+Interface
> 
> 
>  Cheers,
>  Stephane
> 
> >>>
> >>
> >>
> >
>
>
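
As a purely illustrative footnote to the thread above, the call-site shape being
debated, together with Jay's "document the meaning of null" option, looks roughly
like this (topic, key and value are made-up examples, and the comments reflect one
reading of current producer behaviour, not wording from the thread):

import org.apache.kafka.clients.producer.ProducerRecord;

public class ProducerRecordNullArgs {
    public static void main(String[] args) {
        // Constructor: ProducerRecord(topic, partition, timestamp, key, value).
        // A null partition means "let the partitioner pick the partition";
        // a null timestamp means "use the current time" (subject to the topic's
        // message.timestamp.type setting).
        ProducerRecord<String, String> record =
                new ProducerRecord<>("events", null, null, "user-42", "payload");
        System.out.println(record);
    }
}

A builder along the lines proposed in KIP-141 would replace the two null
placeholders with named, optional setters; the snippet above only shows the
existing constructor.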


[GitHub] kafka pull request #2952: MINOR: fix some c/p error that was causing describ...

2017-05-02 Thread norwood
GitHub user norwood opened a pull request:

https://github.com/apache/kafka/pull/2952

MINOR: fix some c/p error that was causing describe to delete

@ijuma @cmccabe 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/norwood/kafka describe-not-delete

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2952.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2952


commit 916dbb1eac9f3d89cb1f94302bc8fb911f3eba72
Author: dan norwood 
Date:   2017-05-02T06:43:04Z

fix some c/p error that was causing describe to delete




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #2953: handle 0 futures in all()

2017-05-02 Thread norwood
GitHub user norwood opened a pull request:

https://github.com/apache/kafka/pull/2953

handle 0 futures in all()

if we pass in 0 futures to an AllOfAdapter, we should complete immediately
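
For illustration, the behaviour being fixed can be sketched with
java.util.concurrent.CompletableFuture instead of Kafka's own future internals
(a hypothetical helper, not the patched code):

import java.util.List;
import java.util.concurrent.CompletableFuture;

class AllOfSketch {
    // Completes when every input future has completed; with zero inputs it must
    // complete right away rather than never completing at all.
    static CompletableFuture<Void> allOf(List<CompletableFuture<?>> futures) {
        if (futures.isEmpty()) {
            return CompletableFuture.completedFuture(null); // the "0 futures" case
        }
        return CompletableFuture.allOf(futures.toArray(new CompletableFuture<?>[0]));
    }
}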

@ijuma @cmccabe 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/norwood/kafka handle-all-of-0

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2953.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2953


commit 5ca7e69ec75569c44226bb8d9863d7cd88714523
Author: dan norwood 
Date:   2017-05-02T06:44:25Z

handle 0 futures in all()

if we pass in 0 futures to an AllOfAdapter, we should complete immediately




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (KAFKA-5154) Kafka Streams throws NPE during rebalance

2017-05-02 Thread Lukas Gemela (JIRA)
Lukas Gemela created KAFKA-5154:
---

 Summary: Kafka Streams throws NPE during rebalance
 Key: KAFKA-5154
 URL: https://issues.apache.org/jira/browse/KAFKA-5154
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 0.10.2.0
Reporter: Lukas Gemela


please see attached log, Kafka streams throws NullPointerException during 
rebalance, which is caught by our custom exception handler

{noformat}
2017-04-30T17:44:17,675 INFO  kafka-coordinator-heartbeat-thread | hades 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
 @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
dead for group hades
2017-04-30T17:44:27,395 INFO  StreamThread-1 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
@573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
for group hades.
2017-04-30T17:44:27,941 INFO  StreamThread-1 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare() 
@393 - Revoking previously assigned partitions [poseidonIncidentFeed-27, 
poseidonIncidentFeed-29, poseidonIncidentFeed-30, poseidonIncidentFeed-18] for 
group hades
2017-04-30T17:44:27,947 INFO  StreamThread-1 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
 @407 - (Re-)joining group hades
2017-04-30T17:44:48,468 INFO  StreamThread-1 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
 @407 - (Re-)joining group hades
2017-04-30T17:44:53,628 INFO  StreamThread-1 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
 @407 - (Re-)joining group hades
2017-04-30T17:45:09,587 INFO  StreamThread-1 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
 @407 - (Re-)joining group hades
2017-04-30T17:45:11,961 INFO  StreamThread-1 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
@375 - Successfully joined group hades with generation 99
2017-04-30T17:45:13,126 INFO  StreamThread-1 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete()
 @252 - Setting newly assigned partitions [poseidonIncidentFeed-11, 
poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
2017-04-30T17:46:37,254 INFO  kafka-coordinator-heartbeat-thread | hades 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
 @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
dead for group hades
2017-04-30T18:04:25,993 INFO  StreamThread-1 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
@573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
for group hades.
2017-04-30T18:04:29,401 INFO  StreamThread-1 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare() 
@393 - Revoking previously assigned partitions [poseidonIncidentFeed-11, 
poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
2017-04-30T18:05:10,877 INFO  StreamThread-1 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
 @407 - (Re-)joining group hades
2017-05-01T00:01:55,707 INFO  StreamThread-1 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
 @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
dead for group hades
2017-05-01T00:01:59,027 INFO  StreamThread-1 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
@573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
for group hades.
2017-05-01T00:01:59,031 ERROR StreamThread-1 
org.apache.kafka.streams.processor.internals.StreamThread.run() @376 - 
stream-thread [StreamThread-1] Streams application error during processing:
 java.lang.NullPointerException
at 
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:619)
 ~[kafka-streams-0.10.2.0.jar!/:?]
at 
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
 [kafka-streams-0.10.2.0.jar!/:?]
2017-05-01T00:02:00,038 INFO  StreamThread-1 
org.apache.kafka.clients.producer.KafkaProducer.close() @689 - Closing the 
Kafka producer with timeoutMillis = 9223372036854775807 ms.
2017-05-01T00:02:00,949 WARN  StreamThread-1 
org.apache.kafka.streams.processor.internals.StreamThread.setState() @160 - 
Unexpected state transition from PARTITIONS_REVOKED to NOT_RUNNING
2017-05-01T00:02:00,951 ERROR StreamThread-1 
com.williamhill.trading.platform.hades.kafka.KafkaStreamManager.uncaughtException()
 @104 - UncaughtException in thread StreamThread-1, stopping kafka streams
 

[jira] [Created] (KAFKA-5153) KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting

2017-05-02 Thread Arpan (JIRA)
Arpan created KAFKA-5153:


 Summary: KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : 
Service Impacting
 Key: KAFKA-5153
 URL: https://issues.apache.org/jira/browse/KAFKA-5153
 Project: Kafka
  Issue Type: Bug
 Environment: RHEL 6
Java Version  1.8.0_91-b14
Reporter: Arpan
Priority: Critical
 Attachments: server_1_72server.log, server_2_73_server.log, 
server_3_74Server.log, server.properties, ThreadDump_1493564142.dump, 
ThreadDump_1493564177.dump, ThreadDump_1493564249.dump

Hi Team, 

I was earlier referring to issue KAFKA-4477 because the problem I am facing is 
similar. I tried to search for the same reference in the release docs as well 
but did not find anything in 0.10.1.1 or 0.10.2.0. I am currently using 
2.11_0.10.2.0.

I have a 3-node cluster for Kafka and a ZooKeeper cluster as well on the same 
set of servers, in cluster mode. We have around 240 GB of data transferred 
through Kafka every day. What we are observing is the server disconnecting from 
the cluster and the ISR shrinking, and it starts impacting service.

I have also observed the file descriptor count increasing a bit; in normal 
circumstances we have not observed an FD count of more than 500, but when the 
issue started we were observing it in the range of 650-700 on all 3 servers. 
Attaching thread dumps of all 3 servers from when we started facing the issue 
recently.

The issue vanishes once you bounce the nodes, and the setup does not run for 
more than 5 days without this issue. Attaching server logs as well.

Kindly let me know if you need any additional information. Attaching 
server.properties as well for one of the servers (it's similar on all 3 
servers).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] KIP 145 - Expose Record Headers in Kafka Connect

2017-05-02 Thread Michael André Pearce
Hi Ewen,

So on the point of JMS: the predefined/standardised JMS and JMSX headers have 
predefined types, so these can be serialised/deserialised accordingly.

Custom JMS headers, agreed, could be a bit more difficult, but on the 80/20 rule 
I would agree they're mostly string values, and as you can hold bytes as a 
string anyhow, defaulting to that wouldn't cause any issue.

But I think we may easily be able to do one better.

Obviously one can override/configure the headers converter, but we could supply 
a default converter that takes a config file with a key-to-type mapping?

That would allow people to define/declare a header key with the expected type in 
some property file, to support string, byte[] and primitives, with undefined 
headers just defaulting to String or byte[].

We could also pre-define known headers like the JMS ones mentioned above.

E.g.

AwesomeHeader1=boolean 
AwesomeHeader2=long
JMSCorrelationId=String
JMSXGroupId=String


What do you think?
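
Purely as a sketch of that idea (the class and file handling below are made up
for illustration, not an existing Connect API), a default converter could load
such a key-to-type properties file and coerce raw header values accordingly:

import java.io.FileReader;
import java.io.IOException;
import java.util.Properties;

class HeaderTypeMappingSketch {
    private final Properties typeByKey = new Properties();

    HeaderTypeMappingSketch(String mappingFile) throws IOException {
        // mappingFile holds lines such as:
        //   AwesomeHeader1=boolean
        //   AwesomeHeader2=long
        //   JMSCorrelationId=String
        try (FileReader reader = new FileReader(mappingFile)) {
            typeByKey.load(reader);
        }
    }

    // Keys not declared in the file default to String, as suggested above.
    Object convert(String headerKey, String rawValue) {
        String type = typeByKey.getProperty(headerKey, "string");
        switch (type.toLowerCase()) {
            case "boolean": return Boolean.parseBoolean(rawValue);
            case "long":    return Long.parseLong(rawValue);
            case "int":     return Integer.parseInt(rawValue);
            case "bytes":   return rawValue.getBytes();
            default:        return rawValue; // string
        }
    }
}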


Cheers
Mike






Sent from my iPhone

> On 2 May 2017, at 18:45, Ewen Cheslack-Postava  wrote:
> 
> A couple of thoughts:
> 
> First, agreed that we definitely want to expose header functionality. Thank
> you Mike for starting the conversation! Even if Connect doesn't do anything
> special with it, there's value in being able to access/set headers.
> 
> On motivation -- I think there are much broader use cases. When thinking
> about exposing headers, I'd actually use Replicator as only a minor
> supporting case. The reason is that it is a very uncommon case where there
> is zero impedance mismatch between the source and sink of the data since
> they are both Kafka. This means you don't need to think much about data
> formats/serialization. I think the JMS use case is a better example since
> JMS headers and Kafka headers don't quite match up. Here's a quick list of
> use cases I can think of off the top of my head:
> 
> 1. Include headers from other systems that support them: JMS (or really any
> MQ), HTTP
> 2. Other connector-specific headers. For example, from JDBC maybe the table
> the data comes from is a header; for a CDC connector you might include the
> binlog offset as a header.
> 3. Interceptor/SMT-style use cases for annotating things like provenance of
> data:
> 3a. Generically w/ user-supplied data like data center, host, app ID, etc.
> 3b. Kafka Connect framework level info, such as the connector/task
> generating the data
> 
> On deviation from Connect's model -- to be honest, the KIP-82 also deviates
> quite substantially from how Kafka handles data already, so we may struggle
> a bit to rectify the two. (In particular, headers specify some structure
> and enforce strings specifically for header keys, but then require you to
> do serialization of header values yourself...).
> 
> I think the use cases I mentioned above may also need different approaches
> to how the data in headers are handled. As Gwen mentions, if we expose the
> headers to Connectors, they need to have some idea of the format and the
> reason for byte[] values in KIP-82 is to leave that decision up to the
> organization using them. But without knowing the format, connectors can't
> really do anything with them -- if a source connector assumes a format,
> they may generate data incompatible with the format used by the rest of the
> organization. On the other hand, I have a feeling most people will just use
>  headers, so allowing connectors to embed arbitrarily
> complex data may not work out well in practice. Or maybe we leave it
> flexible, most people default to using StringConverter for the serializer
> and Connectors will end up defaulting to that just for compatibility...
> 
> I'm not sure I have a real proposal yet, but I do think understanding the
> impact of using a Converter for headers would be useful, and we might want
> to think about how this KIP would fit in with transformations (or if that
> is something that can be deferred, handled separately from the existing
> transformations, etc).
> 
> -Ewen
> 
> On Mon, May 1, 2017 at 11:52 AM, Michael Pearce 
> wrote:
> 
>> Hi Gwen,
>> 
>> The intent here was to allow tools that perform a similar role to mirror
>> makers, replicating the messages from one cluster to another.  E.g. mirror
>> maker should just be taking and transferring the headers as is.
>> 
>> We don't actually use this inside our company, so not exposing this isn't
>> an issue for us. I just believe there are companies like Confluent who have
>> tools like Replicator that do.
>> 
>> And as good citizens I think we should complete the work and expose the
>> headers the same as in the record, to at least allow them to replicate the
>> messages as is. Note Steph seems to want it.
>> 
>> Cheers
>> Mike
>> 
>> Sent using OWA for iPhone
>> 
>> From: Gwen Shapira 
>> Sent: Monday, May 1, 2017 2:36:34 PM
>> To: dev@kafka.apache.org
>> Subject: 

[GitHub] kafka pull request #2960: KAFKA-4343: Expose Connector type in REST API

2017-05-02 Thread norwood
GitHub user norwood opened a pull request:

https://github.com/apache/kafka/pull/2960

KAFKA-4343: Expose Connector type in REST API


https://cwiki.apache.org/confluence/display/KAFKA/KIP-151+Expose+Connector+type+in+REST+API

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/norwood/kafka KIP-151

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2960.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2960


commit 412d137876e2bef3139643a13beeffd18201419a
Author: dan norwood 
Date:   2017-05-02T20:41:50Z

KIP-151: Expose Connector type in REST API 



https://cwiki.apache.org/confluence/display/KAFKA/KIP-151+Expose+Connector+type+in+REST+API




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4343) Connect REST API should expose whether each connector is a source or sink

2017-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993709#comment-15993709
 ] 

ASF GitHub Bot commented on KAFKA-4343:
---

GitHub user norwood opened a pull request:

https://github.com/apache/kafka/pull/2960

KAFKA-4343: Expose Connector type in REST API


https://cwiki.apache.org/confluence/display/KAFKA/KIP-151+Expose+Connector+type+in+REST+API

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/norwood/kafka KIP-151

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2960.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2960


commit 412d137876e2bef3139643a13beeffd18201419a
Author: dan norwood 
Date:   2017-05-02T20:41:50Z

KIP-151: Expose Connector type in REST API 



https://cwiki.apache.org/confluence/display/KAFKA/KIP-151+Expose+Connector+type+in+REST+API




> Connect REST API should expose whether each connector is a source or sink
> -
>
> Key: KAFKA-4343
> URL: https://issues.apache.org/jira/browse/KAFKA-4343
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Ewen Cheslack-Postava
>Assignee: dan norwood
>  Labels: needs-kip, newbie++
>
> Currently we don't expose information about whether a connector is a source 
> or sink. This is useful when, e.g., categorizing connectors in a UI. Given 
> naming conventions we try to encourage you *might* be able to determine this 
> via the connector's class name, but that isn't reliable.
> This may be related to https://issues.apache.org/jira/browse/KAFKA-4279 as we 
> might just want to expose this information on a per-connector-plugin basis 
> and expect any users to tie the results from multiple API requests together. 
> An alternative that would probably be simpler for users would be to include 
> it in the /connectors//status output.
> Note that this will require a KIP as it adds to the public REST API.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (KAFKA-5154) Kafka Streams throws NPE during rebalance

2017-05-02 Thread Lukas Gemela (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993902#comment-15993902
 ] 

Lukas Gemela edited comment on KAFKA-5154 at 5/2/17 10:27 PM:
--

Hi,

we have a kafka streams app consuming from one topic, aggregating data on 
another internal topic (by using in-memory logged store) and then outputting 
them to out topic.  
Input topic has I believe 40 partitions and therefore internal topic has 40 
partitions as well.
We are running 8 parallel nodes of this app at all times.
We are using low level kafka streams API.
None of our code is blocking nor asynchronous.

I'm not really sure what caused the rebalance as all logs from all nodes are 
lost now, I've just seen couple of illegalStateExceptions on another nodes, 
coming from kafka streams stack, but I'm not even sure if they were thrown 
around the same time as NPE happened on this node.

Hope it helps!


was (Author: lukas gemela):
Hi,

we have a kafka streams app consuming from one topic, aggregating data on 
another internal topic (by using in-memory logged store) and then outputting 
them to out topic.  
Input topic has I believe 40 partitions and therefore internal topic has 40 
partitions as well.
We are running 8 parallel nodes of this app at all times.
We are using low level kafka streams API.
Non of our calls is blocking nor asynchronous.

I'm not really sure what caused the rebalance as all logs from all nodes are 
lost now, I've just seen couple of illegalStateExceptions on another nodes, 
coming from kafka streams stack, but I'm not even sure if they were thrown 
around the same time as NPE happened on this node.

Hope it helps!

> Kafka Streams throws NPE during rebalance
> -
>
> Key: KAFKA-5154
> URL: https://issues.apache.org/jira/browse/KAFKA-5154
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>
> please see attached log, Kafka streams throws NullPointerException during 
> rebalance, which is caught by our custom exception handler
> {noformat}
> 2017-04-30T17:44:17,675 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T17:44:27,395 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T17:44:27,941 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-27, 
> poseidonIncidentFeed-29, poseidonIncidentFeed-30, poseidonIncidentFeed-18] 
> for group hades
> 2017-04-30T17:44:27,947 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:48,468 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:53,628 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:09,587 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:11,961 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @375 - Successfully joined group hades with generation 99
> 2017-04-30T17:45:13,126 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete()
>  @252 - Setting newly assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T17:46:37,254 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T18:04:25,993 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T18:04:29,401 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-11, 
> 

[jira] [Comment Edited] (KAFKA-5154) Kafka Streams throws NPE during rebalance

2017-05-02 Thread Lukas Gemela (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993902#comment-15993902
 ] 

Lukas Gemela edited comment on KAFKA-5154 at 5/2/17 10:33 PM:
--

Hi,

we have a kafka streams app consuming from one topic, aggregating data on 
another internal topic (by using in-memory logged store) and then outputting 
them to out topic.  
Input topic has I believe 40 partitions and therefore internal topic has 40 
partitions as well.
We are running 8 parallel nodes of this app at all times.
We are using low level kafka streams API.
None of our code is blocking nor asynchronous.

I'm not really sure what caused the rebalance as all logs from all nodes are 
lost now, I've just seen couple of illegalStateExceptions on another nodes, 
coming from kafka streams stack, but I'm not even sure if they were thrown 
around the same time as NPE happened on this node.

Please also note that in our exception handler we try to stop kafka streams app 
with our custom timeout and schedule a spring task to start it again. As far I 
can tell calling 

boolean isClosed = streams.close(10, TimeUnit.SECONDS)

never actually succeed and we always get isClosed = false.

Hope it helps!

Lukas


was (Author: lukas gemela):
Hi,

we have a kafka streams app consuming from one topic, aggregating data on 
another internal topic (by using in-memory logged store) and then outputting 
them to out topic.  
Input topic has I believe 40 partitions and therefore internal topic has 40 
partitions as well.
We are running 8 parallel nodes of this app at all times.
We are using low level kafka streams API.
None of our code is blocking nor asynchronous.

I'm not really sure what caused the rebalance as all logs from all nodes are 
lost now, I've just seen couple of illegalStateExceptions on another nodes, 
coming from kafka streams stack, but I'm not even sure if they were thrown 
around the same time as NPE happened on this node.

Hope it helps!

> Kafka Streams throws NPE during rebalance
> -
>
> Key: KAFKA-5154
> URL: https://issues.apache.org/jira/browse/KAFKA-5154
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>
> please see attached log, Kafka streams throws NullPointerException during 
> rebalance, which is caught by our custom exception handler
> {noformat}
> 2017-04-30T17:44:17,675 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T17:44:27,395 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T17:44:27,941 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-27, 
> poseidonIncidentFeed-29, poseidonIncidentFeed-30, poseidonIncidentFeed-18] 
> for group hades
> 2017-04-30T17:44:27,947 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:48,468 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:53,628 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:09,587 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:11,961 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @375 - Successfully joined group hades with generation 99
> 2017-04-30T17:45:13,126 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete()
>  @252 - Setting newly assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T17:46:37,254 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T18:04:25,993 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered 

[jira] [Comment Edited] (KAFKA-5154) Kafka Streams throws NPE during rebalance

2017-05-02 Thread Lukas Gemela (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993902#comment-15993902
 ] 

Lukas Gemela edited comment on KAFKA-5154 at 5/2/17 10:33 PM:
--

Hi,

we have a kafka streams app consuming from one topic, aggregating data on 
another internal topic (by using in-memory logged store) and then outputting 
them to out topic.  
Input topic has I believe 40 partitions and therefore internal topic has 40 
partitions as well.
We are running 8 parallel nodes of this app at all times.
We are using low level kafka streams API.
None of our code is blocking nor asynchronous.

I'm not really sure what caused the rebalance as all logs from all nodes are 
lost now, I've just seen couple of illegalStateExceptions on another nodes, 
coming from kafka streams stack, but I'm not even sure if they were thrown 
around the same time as NPE happened on this node.


___
Probably not important but might be related to the issue:
Please also note that in our exception handler we try to stop kafka streams app 
with our custom timeout and schedule a spring task to start it again. As far I 
can tell calling 

boolean isClosed = streams.close(10, TimeUnit.SECONDS)

never actually succeed and we always get isClosed = false.

Hope it helps!

Lukas


was (Author: lukas gemela):
Hi,

we have a kafka streams app consuming from one topic, aggregating data on 
another internal topic (by using in-memory logged store) and then outputting 
them to out topic.  
Input topic has I believe 40 partitions and therefore internal topic has 40 
partitions as well.
We are running 8 parallel nodes of this app at all times.
We are using low level kafka streams API.
None of our code is blocking nor asynchronous.

I'm not really sure what caused the rebalance as all logs from all nodes are 
lost now, I've just seen couple of illegalStateExceptions on another nodes, 
coming from kafka streams stack, but I'm not even sure if they were thrown 
around the same time as NPE happened on this node.

Please also note that in our exception handler we try to stop kafka streams app 
with our custom timeout and schedule a spring task to start it again. As far I 
can tell calling 

boolean isClosed = streams.close(10, TimeUnit.SECONDS)

never actually succeed and we always get isClosed = false.

Hope it helps!

Lukas

> Kafka Streams throws NPE during rebalance
> -
>
> Key: KAFKA-5154
> URL: https://issues.apache.org/jira/browse/KAFKA-5154
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>
> please see attached log, Kafka streams throws NullPointerException during 
> rebalance, which is caught by our custom exception handler
> {noformat}
> 2017-04-30T17:44:17,675 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T17:44:27,395 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T17:44:27,941 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-27, 
> poseidonIncidentFeed-29, poseidonIncidentFeed-30, poseidonIncidentFeed-18] 
> for group hades
> 2017-04-30T17:44:27,947 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:48,468 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:53,628 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:09,587 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:11,961 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @375 - Successfully joined group hades with generation 99
> 2017-04-30T17:45:13,126 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete()
>  @252 - Setting newly assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T17:46:37,254 

[jira] [Comment Edited] (KAFKA-5154) Kafka Streams throws NPE during rebalance

2017-05-02 Thread Lukas Gemela (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993902#comment-15993902
 ] 

Lukas Gemela edited comment on KAFKA-5154 at 5/2/17 10:34 PM:
--

Hi,

we have a kafka streams app consuming from one topic, aggregating data on 
another internal topic (by using in-memory logged store) and then outputting 
them to out topic.  
Input topic has I believe 40 partitions and therefore internal topic has 40 
partitions as well.
We are running 8 parallel nodes of this app at all time.
We are using low level kafka streams API.
None of our code is blocking nor asynchronous.

I'm not really sure what caused the rebalance as all logs from all nodes are 
lost now, I've just seen couple of illegalStateExceptions on another nodes, 
coming from kafka streams stack, but I'm not even sure if they were thrown 
around the same time as NPE happened on this node.


___
Probably not important but might be related to the issue:
Please also note that in our exception handler we try to stop kafka streams app 
with our custom timeout and schedule a spring task to start it again. As far I 
can tell calling 

boolean isClosed = streams.close(10, TimeUnit.SECONDS)

never actually succeed and we always get isClosed = false.

Hope it helps!

Lukas


was (Author: lukas gemela):
Hi,

we have a kafka streams app consuming from one topic, aggregating data on 
another internal topic (by using in-memory logged store) and then outputting 
them to out topic.  
Input topic has I believe 40 partitions and therefore internal topic has 40 
partitions as well.
We are running 8 parallel nodes of this app at all times.
We are using low level kafka streams API.
None of our code is blocking nor asynchronous.

I'm not really sure what caused the rebalance as all logs from all nodes are 
lost now, I've just seen couple of illegalStateExceptions on another nodes, 
coming from kafka streams stack, but I'm not even sure if they were thrown 
around the same time as NPE happened on this node.


___
Probably not important but might be related to the issue:
Please also note that in our exception handler we try to stop kafka streams app 
with our custom timeout and schedule a spring task to start it again. As far I 
can tell calling 

boolean isClosed = streams.close(10, TimeUnit.SECONDS)

never actually succeed and we always get isClosed = false.

Hope it helps!

Lukas

> Kafka Streams throws NPE during rebalance
> -
>
> Key: KAFKA-5154
> URL: https://issues.apache.org/jira/browse/KAFKA-5154
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>
> please see attached log, Kafka streams throws NullPointerException during 
> rebalance, which is caught by our custom exception handler
> {noformat}
> 2017-04-30T17:44:17,675 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T17:44:27,395 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T17:44:27,941 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-27, 
> poseidonIncidentFeed-29, poseidonIncidentFeed-30, poseidonIncidentFeed-18] 
> for group hades
> 2017-04-30T17:44:27,947 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:48,468 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:53,628 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:09,587 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:11,961 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @375 - Successfully joined group hades with generation 99
> 2017-04-30T17:45:13,126 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete()
>  @252 - Setting newly assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> 

Re: Kafka Connect and Partitions

2017-05-02 Thread Randall Hauch
Hi, David.

Excellent. I'm glad that you've solved the puzzle.
Best regards,

Randall

On Tue, May 2, 2017 at 9:18 AM,  wrote:

> Hi Gwen/Randall,
>
> I think I've finally understood, more or less, how partitioning relates to
> SourceRecords.
>
> Because I was using the SourceRecord constructor that doesn't provide
> values for key and key schema, the key is null.  The DefaultPartitioner
> appears to map null to a constant value rather than round-robin across all
> of the partitions :(
> SourceRecord(Map sourcePartition, Map
> sourceOffset, String topic, Schema valueSchema, Object value)
>
> Another SourceRecord constructor enables the partition to be specified but
> I'd prefer not to use this as I don't want to couple the non-Kafka source
> side to Kafka by making it aware of topic partitions - this would also
> presumably involve coupling it to the Cluster so that the number of
> partitions in a topic can be determined :(
> SourceRecord(Map sourcePartition, Map
> sourceOffset, String topic, Integer partition, Schema keySchema, Object
> key, Schema valueSchema, Object value)
>
> Instead, if I use the SourceRecord constructor that also takes arguments
> for the key and key schema (making them take the same values as the value
> and value schema in my application), then the custom partitioner /
> producer.partitioner.class property is not required and the data is
> distributed across the partitions :)
> SourceRecord(Map sourcePartition, Map
> sourceOffset, String topic, Schema keySchema, Object key, Schema valueSchema,
> Object value)
>
> Many thanks once again for your guidance.  I think this puzzle is now
> solved :)
> Best wishes,
> David
>
> -Original Message-
> From: Randall Hauch [mailto:rha...@gmail.com]
> Sent: 28 April 2017 16:08
> To: dev@kafka.apache.org
> Subject: Re: Kafka Connect and Partitions
>
> The source connector creates a SourceRecord object and can set a number of
> fields, including the message's key and value, the Kafka topic name and, if
> desired, the Kafka topic partition number. If the connector does set the
> topic partition to a non-null value, then that's the partition to which
> Kafka Connect will write the message; otherwise, the partitioner
> (e.g., your custom partitioner) used by the Kafka Connect producer will
> choose/compute the partition based purely upon the key and value byte
> arrays. Note that if the connector doesn't set the topic partition number
> and no special producer partitioner is specified, the default hash-based
> Kafka partitioner will be used.
>
> So, the connector can certainly set the topic partition number, and it may
> be easier to do it there since the connector actually has the key and
> values before they are serialized. But no matter what, the connector is the
> only thing that sets the message key in the source record.
>
> BTW, the SourceRecord's "source position" and "source offset" are actually
> the connector-defined information about the source and where the connector
> has read in that source. Don't confuse these with the topic name or topic
> partition number.
>
> Hope that helps.
>
> Randall
>
> On Fri, Apr 28, 2017 at 7:15 AM,  wrote:
>
> > Hi Gwen,
> >
> > Having added a custom partitioner (via the producer.partitioner.class
> > property in worker.properties) that simply randomly selects a partition,
> > the data is now written evenly across all the partitions :)
> >
> > The root of my confusion regarding why the default partitioner writes all
> > data to the same partition is that I don't understand how the
> SourceRecords
> > returned in the source task poll() method are used by the partitioner.
> The
> > data that is passed to the partitioner includes a key Object (which is an
> > empty byte array - presumably this is a bad idea!), and a value Object
> > (which is a non-empty byte array):
> >
> > @Override
> > public int partition(String topic, Object key, byte[] keyBytes,
> Object
> > value, byte[] valueBytes, Cluster cluster) {
> > System.out.println(String.format(
> > "### PARTITION key[%s][%s][%d] value[%s][%s][%d]",
> > key, key.getClass().getSimpleName(), keyBytes.length,
> > value, value.getClass().getSimpleName(),
> > valueBytes.length));
> >
> > =>
> > ### PARTITION key[[B@584f599f][byte[]][0] value[[B@73cc0cd8][byte[]][
> 236]
> >
> > However, I don't understand how the above key and value are derived from
> > the SourceRecord attributes which, in my application's case, is as
> follows:
> >
> > events.add(new SourceRecord(
> > offsetKey(filename),
> > offsetValue(++recordIndex),
> > topicName,
> > Schema.BYTES_SCHEMA,
> > line));
> > 
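
As a footnote to the exchange above, a minimal sketch of the approach David
describes, i.e. populating the key schema and key on the SourceRecord so the
producer's default partitioner has something to hash (the topic name and the
choice of the line as the key are made-up examples):

import java.util.Collections;
import java.util.Map;

import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.source.SourceRecord;

class KeyedSourceRecordSketch {
    static SourceRecord build(String filename, long recordIndex, String line) {
        Map<String, ?> sourcePartition = Collections.singletonMap("filename", filename);
        Map<String, ?> sourceOffset = Collections.singletonMap("position", recordIndex);
        // Supplying a key schema and key gives the producer's default partitioner
        // something to hash, so records spread across partitions without the
        // connector choosing a partition number itself.
        return new SourceRecord(sourcePartition, sourceOffset,
                "my-topic",                   // topic
                Schema.STRING_SCHEMA, line,   // key schema, key
                Schema.STRING_SCHEMA, line);  // value schema, value
    }
}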

[jira] [Assigned] (KAFKA-4343) Connect REST API should expose whether each connector is a source or sink

2017-05-02 Thread dan norwood (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dan norwood reassigned KAFKA-4343:
--

Assignee: dan norwood

> Connect REST API should expose whether each connector is a source or sink
> -
>
> Key: KAFKA-4343
> URL: https://issues.apache.org/jira/browse/KAFKA-4343
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Ewen Cheslack-Postava
>Assignee: dan norwood
>  Labels: needs-kip, newbie++
>
> Currently we don't expose information about whether a connector is a source 
> or sink. This is useful when, e.g., categorizing connectors in a UI. Given 
> naming conventions we try to encourage you *might* be able to determine this 
> via the connector's class name, but that isn't reliable.
> This may be related to https://issues.apache.org/jira/browse/KAFKA-4279 as we 
> might just want to expose this information on a per-connector-plugin basis 
> and expect any users to tie the results from multiple API requests together. 
> An alternative that would probably be simpler for users would be to include 
> it in the /connectors//status output.
> Note that this will require a KIP as it adds to the public REST API.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2959: Created convenience method to create ZkUtils

2017-05-02 Thread bbaugher
GitHub user bbaugher opened a pull request:

https://github.com/apache/kafka/pull/2959

Created convenience method to create ZkUtils



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bbaugher/kafka zkutils

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2959.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2959


commit b1e201bd0c80e9f842a0ca9f83fc9bd0f9793075
Author: Bryan Baugher 
Date:   2017-05-02T20:30:58Z

Created convenience method to create ZkUtils




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #2961: MINOR: Serialize the real isolationLevel in FetchR...

2017-05-02 Thread apurvam
GitHub user apurvam opened a pull request:

https://github.com/apache/kafka/pull/2961

MINOR: Serialize the real isolationLevel in FetchRequest



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apurvam/kafka 
MINOR-serialize-isolation-level-in-fetch-request

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2961.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2961


commit 1caddec12bb12af61dec2fe6909bf04527a9e351
Author: Apurva Mehta 
Date:   2017-05-02T22:12:35Z

Serialize the real isolationLevel in FetchRequest




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5154) Kafka Streams throws NPE during rebalance

2017-05-02 Thread Lukas Gemela (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993902#comment-15993902
 ] 

Lukas Gemela commented on KAFKA-5154:
-

Hi,

we have a Kafka Streams app consuming from one topic, aggregating data on 
another internal topic (using an in-memory logged store) and then outputting 
the results to an output topic.
The input topic has, I believe, 40 partitions and therefore the internal topic 
has 40 partitions as well.
We are running 8 parallel nodes of this app at all times.
We are using the low-level Kafka Streams API.
None of our calls are blocking or asynchronous.

I'm not really sure what caused the rebalance as all logs from all nodes are 
lost now. I've just seen a couple of IllegalStateExceptions on other nodes, 
coming from the Kafka Streams stack, but I'm not even sure whether they were 
thrown around the same time as the NPE happened on this node.

Hope it helps!

> Kafka Streams throws NPE during rebalance
> -
>
> Key: KAFKA-5154
> URL: https://issues.apache.org/jira/browse/KAFKA-5154
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>
> please see attached log, Kafka streams throws NullPointerException during 
> rebalance, which is caught by our custom exception handler
> {noformat}
> 2017-04-30T17:44:17,675 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T17:44:27,395 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T17:44:27,941 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-27, 
> poseidonIncidentFeed-29, poseidonIncidentFeed-30, poseidonIncidentFeed-18] 
> for group hades
> 2017-04-30T17:44:27,947 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:48,468 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:53,628 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:09,587 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:11,961 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @375 - Successfully joined group hades with generation 99
> 2017-04-30T17:45:13,126 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete()
>  @252 - Setting newly assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T17:46:37,254 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T18:04:25,993 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T18:04:29,401 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T18:05:10,877 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-05-01T00:01:55,707 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-05-01T00:01:59,027 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-05-01T00:01:59,031 ERROR 

[jira] [Created] (KAFKA-5164) SetSchemaMetadata does not replace the schemas in structs correctly

2017-05-02 Thread Ewen Cheslack-Postava (JIRA)
Ewen Cheslack-Postava created KAFKA-5164:


 Summary: SetSchemaMetadata does not replace the schemas in structs 
correctly
 Key: KAFKA-5164
 URL: https://issues.apache.org/jira/browse/KAFKA-5164
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 0.10.2.1
Reporter: Ewen Cheslack-Postava


In SetSchemaMetadataTest we verify that the name and version of the schema in 
the record have been replaced:

https://github.com/apache/kafka/blob/trunk/connect/transforms/src/test/java/org/apache/kafka/connect/transforms/SetSchemaMetadataTest.java#L62

However, in the case of Structs, the schema will be attached to both the record 
and the Struct itself. So we correctly rebuild the Record:

https://github.com/apache/kafka/blob/trunk/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/SetSchemaMetadata.java#L77
https://github.com/apache/kafka/blob/trunk/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/SetSchemaMetadata.java#L104
https://github.com/apache/kafka/blob/trunk/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/SetSchemaMetadata.java#L119

But if the key or value is a Struct, it will still contain the old schema 
embedded in the Struct.

Ultimately this can lead to validations in other code failing (even for very 
simple changes like adjusting the name of a schema):

{code}
(org.apache.kafka.connect.runtime.WorkerTask:141)
org.apache.kafka.connect.errors.DataException: Mismatching struct schema
at io.confluent.connect.avro.AvroData.fromConnectData(AvroData.java:471)
at io.confluent.connect.avro.AvroData.fromConnectData(AvroData.java:295)
at 
io.confluent.connect.avro.AvroConverter.fromConnectData(AvroConverter.java:73)
at 
org.apache.kafka.connect.runtime.WorkerSourceTask.sendRecords(WorkerSourceTask.java:196)
at 
org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:167)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:139)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:182)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{code}

The solution to this is probably to check whether we're dealing with a Struct 
when we use a new schema and potentially copy/reallocate it.

This particular issue would only appear if we don't modify the data, so I think 
SetSchemaMetadata is currently the only transformation that would have the 
issue.
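
A minimal sketch of the copy/reallocate idea described above (illustrative only, 
not the actual fix; it assumes the updated schema keeps the same fields as the 
original, which holds for SetSchemaMetadata since only the name and version 
change):

{code}
import org.apache.kafka.connect.data.{Schema, Struct}
import scala.collection.JavaConverters._

// Rebuild the value against the updated schema so the Struct's embedded schema
// matches the schema attached to the record.
def copyStructToUpdatedSchema(original: Struct, updatedSchema: Schema): Struct = {
  val updated = new Struct(updatedSchema)
  updatedSchema.fields.asScala.foreach { field =>
    updated.put(field.name, original.get(field.name))
  }
  updated
}
{code}

Because the transformation only changes schema metadata, a field-by-field copy 
like this preserves the data while ensuring the key or value no longer carries 
the stale schema.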



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (KAFKA-5154) Kafka Streams throws NPE during rebalance

2017-05-02 Thread Lukas Gemela (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993902#comment-15993902
 ] 

Lukas Gemela edited comment on KAFKA-5154 at 5/2/17 10:35 PM:
--

Hi,

we have a Kafka Streams app consuming from one topic, aggregating data on 
another internal topic (using an in-memory logged store) and then outputting 
the results to an output topic.
The input topic has, I believe, 40 partitions and therefore the internal topic 
has 40 partitions as well.
We are running 8 parallel nodes of this app at all times.
We are using the low-level Kafka Streams API.
None of our code is blocking or asynchronous.

I'm not really sure what caused the rebalance as all logs from all nodes are 
lost now. I've just seen a couple of IllegalStateExceptions on other nodes, 
coming from the Kafka Streams stack, but I'm not even sure whether they were 
thrown around the same time as the NPE happened on this node.


___
Probably not important, but it might be related to the issue:
Please also note that in our exception handler we try to stop the Kafka Streams 
app with our custom timeout and schedule a Spring task to start it again. As 
far as I can tell, calling 

boolean isClosed = streams.close(10, TimeUnit.SECONDS)

never actually succeeds and we always get isClosed = false.

Hope it helps!

Lukas


was (Author: lukas gemela):
Hi,

we have a kafka streams app consuming from one topic, aggregating data on 
another internal topic (by using in-memory logged store) and then outputting 
them to out topic.  
Input topic has I believe 40 partitions and therefore internal topic has 40 
partitions as well.
We are running 8 parallel nodes of this app at all time.
We are using low level kafka streams API.
None of our code is blocking nor asynchronous.

I'm not really sure what caused the rebalance as all logs from all nodes are 
lost now, I've just seen couple of illegalStateExceptions on another nodes, 
coming from kafka streams stack, but I'm not even sure if they were thrown 
around the same time as NPE happened on this node.


___
Probably not important but might be related to the issue:
Please also note that in our exception handler we try to stop kafka streams app 
with our custom timeout and schedule a spring task to start it again. As far I 
can tell calling 

boolean isClosed = streams.close(10, TimeUnit.SECONDS)

never actually succeed and we always get isClosed = false.

Hope it helps!

Lukas

> Kafka Streams throws NPE during rebalance
> -
>
> Key: KAFKA-5154
> URL: https://issues.apache.org/jira/browse/KAFKA-5154
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>
> please see attached log, Kafka streams throws NullPointerException during 
> rebalance, which is caught by our custom exception handler
> {noformat}
> 2017-04-30T17:44:17,675 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T17:44:27,395 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T17:44:27,941 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-27, 
> poseidonIncidentFeed-29, poseidonIncidentFeed-30, poseidonIncidentFeed-18] 
> for group hades
> 2017-04-30T17:44:27,947 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:48,468 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:53,628 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:09,587 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:11,961 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @375 - Successfully joined group hades with generation 99
> 2017-04-30T17:45:13,126 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete()
>  @252 - Setting newly assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> 

[jira] [Updated] (KAFKA-5127) Replace pattern matching with foreach where the case None is unused

2017-05-02 Thread Balint Molnar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Balint Molnar updated KAFKA-5127:
-
Status: Patch Available  (was: In Progress)

> Replace pattern matching with foreach where the case None is unused 
> 
>
> Key: KAFKA-5127
> URL: https://issues.apache.org/jira/browse/KAFKA-5127
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: Balint Molnar
>Assignee: Balint Molnar
>Priority: Minor
>
> There are various places where pattern matching is used to match only one 
> case while ignoring the None case; this can be replaced with foreach.
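
A minimal illustration of the kind of rewrite being described (the variable name 
is hypothetical):

{code}
// Hypothetical value; in the code base this would be an Option returned by an
// existing API.
val leaderIdOpt: Option[Int] = Some(1)

// Before: a pattern match whose None branch does nothing.
leaderIdOpt match {
  case Some(leaderId) => println(s"leader is $leaderId")
  case None => // nothing to do
}

// After: foreach expresses the same intent without the unused None case.
leaderIdOpt.foreach(leaderId => println(s"leader is $leaderId"))
{code}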



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-4307) Inconsistent parameters between console producer and consumer

2017-05-02 Thread Balint Molnar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Balint Molnar updated KAFKA-4307:
-
Status: Patch Available  (was: In Progress)

> Inconsistent parameters between console producer and consumer
> -
>
> Key: KAFKA-4307
> URL: https://issues.apache.org/jira/browse/KAFKA-4307
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.10.1.0
>Reporter: Gwen Shapira
>Assignee: Balint Molnar
>  Labels: newbie
>
> kafka-console-producer uses --broker-list while kafka-console-consumer uses 
> --bootstrap-server.
> Let's add --bootstrap-server to the producer for some consistency?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-4703) Test with two SASL_SSL listeners with different JAAS contexts

2017-05-02 Thread Balint Molnar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Balint Molnar updated KAFKA-4703:
-
Status: Patch Available  (was: In Progress)

> Test with two SASL_SSL listeners with different JAAS contexts
> -
>
> Key: KAFKA-4703
> URL: https://issues.apache.org/jira/browse/KAFKA-4703
> Project: Kafka
>  Issue Type: Test
>Reporter: Ismael Juma
>Assignee: Balint Molnar
>  Labels: newbie
>
> [~rsivaram] suggested the following in 
> https://github.com/apache/kafka/pull/2406
> {quote}
> I think this feature allows two SASL_SSL listeners, one for external and one 
> for internal and the two can use different mechanisms and different JAAS 
> contexts. That makes the multi-mechanism configuration neater. I think it 
> will be useful to have an integration test for this, perhaps change 
> SaslMultiMechanismConsumerTest.
> {quote}
> And my reply:
> {quote}
> I think it's a bit tricky to support multiple listeners in 
> KafkaServerTestHarness. Maybe it's easier to do the test you suggest in 
> MultipleListenersWithSameSecurityProtocolTest.
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[VOTE] KIP-137: Enhance TopicCommand --describe to show topics marked for deletion

2017-05-02 Thread Mickael Maison
Hi,

I'd like to start the vote on KIP-137:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-137%3A+Enhance+TopicCommand+--describe+to+show+topics+marked+for+deletion

Thanks,


[jira] [Commented] (KAFKA-4772) Exploit #peek to implement #print() and other methods

2017-05-02 Thread james chien (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15992704#comment-15992704
 ] 

james chien commented on KAFKA-4772:


my new PR is here https://github.com/apache/kafka/pull/2955

> Exploit #peek to implement #print() and other methods
> -
>
> Key: KAFKA-4772
> URL: https://issues.apache.org/jira/browse/KAFKA-4772
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: james chien
>Priority: Minor
>  Labels: beginner, newbie
>
> From: https://github.com/apache/kafka/pull/2493#pullrequestreview-22157555
> Things that I can think of:
> - print / writeAsText can be a special impl of peek; KStreamPrint etc. can be 
> removed.
> - consider collapsing KStreamPeek with KStreamForeach, with a flag parameter 
> indicating if the acted-on key-value pair should still be forwarded.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2924: change connection refused message from level DEBUG...

2017-05-02 Thread jedichien
Github user jedichien closed the pull request at:

https://github.com/apache/kafka/pull/2924


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #2955: Replace KeyValuePrinter and KStreamForeach with KS...

2017-05-02 Thread jedichien
GitHub user jedichien opened a pull request:

https://github.com/apache/kafka/pull/2955

Replace KeyValuePrinter and KStreamForeach with KStreamPeek

I removed the two classes `KeyValuePrinter` and `KStreamForeach` and 
implemented their behaviour with `KStreamPeek`.
So now `KStreamPeek` can do the job of both `KeyValuePrinter` and 
`KStreamForeach`.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jedichien/kafka trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2955.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2955


commit da4bcbaa90ed2107128914f7d6ca92add6e616db
Author: jameschien 
Date:   2017-04-27T08:00:20Z

change connection refused message from level DEBUG into WARN.

commit 1adfd44424f2760def0a648c96129243088ef044
Author: jameschien 
Date:   2017-05-02T08:26:04Z

replace KeyValuePrinter and KStreamForeach with KStreamPeek

commit 407b48b83d3acd8e7803bd32e93aa28e48c446a9
Author: jameschien 
Date:   2017-05-02T10:45:23Z

replace KeyValuePrinter and KStreamForeach with KStreamPeek

commit 9c3caa19724606e2e8982ea1c63ce8a480945e73
Author: jameschien 
Date:   2017-05-02T10:51:01Z

revert




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (KAFKA-5155) Messages can be deleted prematurely when some producers use timestamps and some not

2017-05-02 Thread JIRA
Petr Plavjaník created KAFKA-5155:
-

 Summary: Messages can be deleted prematurely when some producers 
use timestamps and some not
 Key: KAFKA-5155
 URL: https://issues.apache.org/jira/browse/KAFKA-5155
 Project: Kafka
  Issue Type: Bug
  Components: log
Affects Versions: 0.10.2.0
Reporter: Petr Plavjaník


Some messages can be deleted prematurely and never read in the following 
scenario. One producer uses timestamps and produces messages that are appended 
to the beginning of a log segment. Another producer produces messages without a 
timestamp. In that case the segment's largest timestamp is determined by the 
old messages that carry a timestamp; the new messages without a timestamp do 
not influence it, so the log segment containing both old and new messages can 
be deleted immediately after the last new message with no timestamp is 
appended. When all appended messages have no timestamp, they are not deleted, 
because the {{lastModified}} attribute of the {{LogSegment}} is used instead.

New test case to {{kafka.log.LogTest}} that fails:
{code}
  @Test
  def 
shouldNotDeleteTimeBasedSegmentsWhenTimestampIsNotProvidedForSomeMessages() {
val retentionMs = 1000
val old = TestUtils.singletonRecords("test".getBytes, timestamp = 0)
val set = TestUtils.singletonRecords("test".getBytes, timestamp = -1)
val log = createLog(set.sizeInBytes, retentionMs = retentionMs)

// append some messages to create some segments
log.append(old)
for (_ <- 0 until 14)
  log.append(set)

log.deleteOldSegments()
assertEquals("There should be 3 segments remaining", 3, 
log.numberOfSegments)
  }
{code}

It can be prevented by using {{def largestTimestamp = 
Math.max(maxTimestampSoFar, lastModified)}} in LogSegment, or by using the 
current timestamp when messages with timestamp {{-1}} are appended.
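
For illustration only, a sketch of how the proposed guard would behave (the 
object and parameter names below are hypothetical, not actual Kafka code): a 
segment whose newest messages carry no timestamp falls back to its recent 
{{lastModified}} time, so the stale maximum timestamp of the older messages 
alone can no longer make the segment eligible for time-based deletion.

{code}
object RetentionCheck {
  // The proposed guard from the description above.
  def largestTimestamp(maxTimestampSoFar: Long, lastModified: Long): Long =
    math.max(maxTimestampSoFar, lastModified)

  // A segment is eligible for time-based deletion only if the guarded
  // timestamp has fallen outside the retention window.
  def eligibleForDeletion(maxTimestampSoFar: Long, lastModified: Long,
                          now: Long, retentionMs: Long): Boolean =
    now - largestTimestamp(maxTimestampSoFar, lastModified) > retentionMs
}
{code}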



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [VOTE] KIP-137: Enhance TopicCommand --describe to show topics marked for deletion

2017-05-02 Thread Ismael Juma
Thanks for the KIP, +1 (binding).

Ismael

On Tue, May 2, 2017 at 11:43 AM, Mickael Maison 
wrote:

> Hi,
>
> I'd like to start the vote on KIP-137:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 137%3A+Enhance+TopicCommand+--describe+to+show+topics+marked+for+deletion
>
> Thanks,
>


[GitHub] kafka pull request #2954: MINOR: Fix error logged if not enough alive broker...

2017-05-02 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/2954

MINOR: Fix error logged if not enough alive brokers for transactions state 
topic



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
fix-error-message-if-transactions-topic-replication-factor-too-low

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2954.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2954


commit 3e1da7d99da95644cef7cc83c04448932f4907ca
Author: Ismael Juma 
Date:   2017-05-02T10:52:52Z

MINOR: Fix error logged if not enough alive brokers for transactions state 
topic




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4422) Drop support for Scala 2.10

2017-05-02 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15992774#comment-15992774
 ] 

Ismael Juma commented on KAFKA-4422:


For 0.11.0.0, we will remove support for Scala 2.10, but we won't use features 
that are only available in Scala 2.11 to make it easier for people to build 
artifacts for Scala 2.10 if they need them (LinkedIn said that this would help 
them). In the next release cycle (probably 0.11.1.0), we will be able to use 
features that are only available in Scala 2.11.

> Drop support for Scala 2.10
> ---
>
> Key: KAFKA-4422
> URL: https://issues.apache.org/jira/browse/KAFKA-4422
> Project: Kafka
>  Issue Type: Task
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>  Labels: kip
> Fix For: 0.11.0.0
>
>
> Now that Scala 2.12 has been released, we should drop support for Scala 2.10  
> in the next major Kafka version so that we keep the number of supported 
> versions at 2. Since we have to compile and run the tests on each supported 
> version, there is a non-trivial cost from a development and testing 
> perspective.
> The clients library is in Java and we recommend people use the Java clients 
> instead of the Scala ones, so dropping support for Scala 2.10 should have a 
> smaller impact than it would have had in the past. Scala 2.10 was released in 
> January 2013 and support ended in March 2015. 
> Once we drop support for Scala 2.10, we can take advantage of APIs and 
> compiler improvements introduced in Scala 2.11 (introduced in April 2014): 
> http://scala-lang.org/news/2.11.0
> Link to KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-119%3A+Drop+Support+for+Scala+2.10+in+Kafka+0.11



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Request to add to the contributor list

2017-05-02 Thread Amit Daga
Hello,

Wondering if you were able to look into it?

Thanks,
Amit Daga

On Sun, Apr 30, 2017 at 1:29 PM, Amit Daga  wrote:

> Hello Team,
>
> Hope you are doing well.
>
> My name is Amit Daga. This is to request you to add me to the contributor
> list. Also if KAFKA-4996 issue is still not assigned, I would like to work
> on it.
>
> Thanks,
> Amit
>


[GitHub] kafka pull request #2952: MINOR: fix some c/p error that was causing describ...

2017-05-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2952


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[DISCUSS] KIP 151 - Expose Connector type in REST API

2017-05-02 Thread dan
hello.

in an attempt to make the connect rest endpoints more useful i'd like to
add the Connector type (Sink/Source) in our rest endpoints to make them
more self descriptive.

KIP here:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-151+Expose+Connector+type+in+REST+API
initial pr: https://github.com/apache/kafka/pull/2960

thanks
dan


Re: [VOTE] KIP-118: Drop Support for Java 7 in Kafka 0.11

2017-05-02 Thread Ismael Juma
Hi all,

As I looked deeper into the details of the impact of this change, it became
apparent to me that it may be better to postpone it to a future release.
Reasons follow:

1. Once we move to Java 8, there will be a ripple effect in all projects
that depend on the Kafka clients jar. Given where we are in the release
cycle[2], we would prefer to focus our resources on ensuring a high quality
release despite the extensive changes to core parts of Kafka (message
format, replication protocol, single-threaded controller, transactions,
etc.). In other words, improving test coverage, stabilising flaky tests (as
they may be hiding real bugs), fixing issues identified, etc. is more
valuable to our users than bumping the minimum JDK version.

2. If we bump the version, exactly-once (idempotent producer and
transactions) and headers won't be available for clients that are stuck
using Java 7. In addition, taking advantage of KIP-101 (an important
improvement to the replication protocol) would require down conversion (and
consequent performance impact) if there are Java 7 clients. Users tend to
have less control on client environments and upgrading the JDK version is
challenging. Our research indicates that this is not uncommon.

3. Even though Oracle no longer publishes security fixes for Java 7 freely
(a support contract is required), Red Hat continues to publish them and
will do so until June 2018[1].

Given the above, I suggest we postpone the Java 8 switch to a subsequent
major release. It's a bit frustrating, but I think this is the right
decision for our
users. Are there any objections?

Ismael

[1] https://access.redhat.com/articles/1299013
[2] https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+0.11.0.0

On Tue, Feb 28, 2017 at 4:11 AM, Becket Qin  wrote:

> +1
>
> On Mon, Feb 27, 2017 at 6:38 PM, Ismael Juma  wrote:
>
> > Thanks to everyone who voted and provided feedback. +1 (binding) from me
> > too.
> >
> > The vote has passed with 4 binding votes (Grant, Jason, Guozhang, Ismael)
> > and 11 non-binding votes (Bill, Damian, Eno, Edoardo, Mickael, Bharat,
> > Onur, Vahid, Colin, Apurva, Tom). There were no 0 or -1 votes.
> >
> > I have updated the relevant wiki pages.
> >
> > Ismael
> >
> > On Thu, Feb 9, 2017 at 3:31 PM, Ismael Juma  wrote:
> >
> > > Hi everyone,
> > >
> > > Since everyone in the discuss thread was in favour (10 people
> responded),
> > > I would like to initiate the voting process for KIP-118: Drop Support
> for
> > > Java 7 in Kafka 0.11:
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 118%3A+Drop+Support+for+Java+7+in+Kafka+0.11
> > >
> > > The vote will run for a minimum of 72 hours.
> > >
> > > Thanks,
> > > Ismael
> > >
> > >
> >
>


Build failed in Jenkins: kafka-trunk-jdk7 #2139

2017-05-02 Thread Apache Jenkins Server
See 


Changes:

[ismael] KAFKA-5131; WriteTxnMarkers and complete commit/abort on partition

--
[...truncated 849.20 KB...]
kafka.server.DynamicConfigChangeTest > testDefaultClientIdQuotaConfigChange 
PASSED

kafka.server.DynamicConfigChangeTest > testQuotaInitialization STARTED

kafka.server.DynamicConfigChangeTest > testQuotaInitialization PASSED

kafka.server.DynamicConfigChangeTest > testUserQuotaConfigChange STARTED

kafka.server.DynamicConfigChangeTest > testUserQuotaConfigChange PASSED

kafka.server.DynamicConfigChangeTest > testClientIdQuotaConfigChange STARTED

kafka.server.DynamicConfigChangeTest > testClientIdQuotaConfigChange PASSED

kafka.server.DynamicConfigChangeTest > testUserClientIdQuotaChange STARTED

kafka.server.DynamicConfigChangeTest > testUserClientIdQuotaChange PASSED

kafka.server.DynamicConfigChangeTest > shouldParseReplicationQuotaProperties 
STARTED

kafka.server.DynamicConfigChangeTest > shouldParseReplicationQuotaProperties 
PASSED

kafka.server.DynamicConfigChangeTest > 
shouldParseRegardlessOfWhitespaceAroundValues STARTED

kafka.server.DynamicConfigChangeTest > 
shouldParseRegardlessOfWhitespaceAroundValues PASSED

kafka.server.DynamicConfigChangeTest > testDefaultUserQuotaConfigChange STARTED

kafka.server.DynamicConfigChangeTest > testDefaultUserQuotaConfigChange PASSED

kafka.server.DynamicConfigChangeTest > shouldParseReplicationQuotaReset STARTED

kafka.server.DynamicConfigChangeTest > shouldParseReplicationQuotaReset PASSED

kafka.server.DynamicConfigChangeTest > testDefaultUserClientIdQuotaConfigChange 
STARTED

kafka.server.DynamicConfigChangeTest > testDefaultUserClientIdQuotaConfigChange 
PASSED

kafka.server.DynamicConfigChangeTest > testConfigChangeOnNonExistingTopic 
STARTED

kafka.server.DynamicConfigChangeTest > testConfigChangeOnNonExistingTopic PASSED

kafka.server.DynamicConfigChangeTest > testConfigChange STARTED

kafka.server.DynamicConfigChangeTest > testConfigChange PASSED

kafka.server.ServerGenerateBrokerIdTest > testGetSequenceIdMethod STARTED

kafka.server.ServerGenerateBrokerIdTest > testGetSequenceIdMethod PASSED

kafka.server.ServerGenerateBrokerIdTest > testBrokerMetadataOnIdCollision 
STARTED

kafka.server.ServerGenerateBrokerIdTest > testBrokerMetadataOnIdCollision PASSED

kafka.server.ServerGenerateBrokerIdTest > testAutoGenerateBrokerId STARTED

kafka.server.ServerGenerateBrokerIdTest > testAutoGenerateBrokerId PASSED

kafka.server.ServerGenerateBrokerIdTest > testMultipleLogDirsMetaProps STARTED

kafka.server.ServerGenerateBrokerIdTest > testMultipleLogDirsMetaProps PASSED

kafka.server.ServerGenerateBrokerIdTest > testDisableGeneratedBrokerId STARTED

kafka.server.ServerGenerateBrokerIdTest > testDisableGeneratedBrokerId PASSED

kafka.server.ServerGenerateBrokerIdTest > testUserConfigAndGeneratedBrokerId 
STARTED

kafka.server.ServerGenerateBrokerIdTest > testUserConfigAndGeneratedBrokerId 
PASSED

kafka.server.ServerGenerateBrokerIdTest > 
testConsistentBrokerIdFromUserConfigAndMetaProps STARTED

kafka.server.ServerGenerateBrokerIdTest > 
testConsistentBrokerIdFromUserConfigAndMetaProps PASSED

kafka.server.AdvertiseBrokerTest > testBrokerAdvertiseHostNameAndPortToZK 
STARTED

kafka.server.AdvertiseBrokerTest > testBrokerAdvertiseHostNameAndPortToZK PASSED

kafka.server.CreateTopicsRequestWithPolicyTest > testValidCreateTopicsRequests 
STARTED

kafka.server.CreateTopicsRequestWithPolicyTest > testValidCreateTopicsRequests 
PASSED

kafka.server.CreateTopicsRequestWithPolicyTest > testErrorCreateTopicsRequests 
STARTED

kafka.server.CreateTopicsRequestWithPolicyTest > testErrorCreateTopicsRequests 
PASSED

kafka.server.SaslPlaintextReplicaFetchTest > testReplicaFetcherThread STARTED

kafka.server.SaslPlaintextReplicaFetchTest > testReplicaFetcherThread PASSED

kafka.server.SaslApiVersionsRequestTest > 
testApiVersionsRequestWithUnsupportedVersion STARTED

kafka.server.SaslApiVersionsRequestTest > 
testApiVersionsRequestWithUnsupportedVersion PASSED

kafka.server.SaslApiVersionsRequestTest > 
testApiVersionsRequestBeforeSaslHandshakeRequest STARTED

kafka.server.SaslApiVersionsRequestTest > 
testApiVersionsRequestBeforeSaslHandshakeRequest PASSED

kafka.server.SaslApiVersionsRequestTest > 
testApiVersionsRequestAfterSaslHandshakeRequest STARTED

kafka.server.SaslApiVersionsRequestTest > 
testApiVersionsRequestAfterSaslHandshakeRequest PASSED

kafka.server.ServerGenerateClusterIdTest > 
testAutoGenerateClusterIdForKafkaClusterParallel STARTED

kafka.server.ServerGenerateClusterIdTest > 
testAutoGenerateClusterIdForKafkaClusterParallel PASSED

kafka.server.ServerGenerateClusterIdTest > testAutoGenerateClusterId STARTED

kafka.server.ServerGenerateClusterIdTest > testAutoGenerateClusterId PASSED

kafka.server.ServerGenerateClusterIdTest > 
testAutoGenerateClusterIdForKafkaClusterSequential STARTED


Build failed in Jenkins: kafka-trunk-jdk8 #1472

2017-05-02 Thread Apache Jenkins Server
See 


Changes:

[ismael] KAFKA-5131; WriteTxnMarkers and complete commit/abort on partition

--
[...truncated 846.70 KB...]
kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithCollision PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokerListWithNoTopics STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokerListWithNoTopics PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testAutoCreateTopic STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testGetAllTopicMetadata 
STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testGetAllTopicMetadata 
PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testBasicTopicMetadata 
STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.PrimitiveApiTest > testMultiProduce STARTED

kafka.integration.PrimitiveApiTest > testMultiProduce PASSED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch STARTED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch PASSED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize 
STARTED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize PASSED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests STARTED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests PASSED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch STARTED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch PASSED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression STARTED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression PASSED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic STARTED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic PASSED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest STARTED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack PASSED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopicWithCollision 
STARTED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopicWithCollision 
PASSED

kafka.integration.PlaintextTopicMetadataTest > testAliveBrokerListWithNoTopics 
STARTED

kafka.integration.PlaintextTopicMetadataTest > testAliveBrokerListWithNoTopics 
PASSED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopic STARTED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.PlaintextTopicMetadataTest > testGetAllTopicMetadata STARTED

kafka.integration.PlaintextTopicMetadataTest > testGetAllTopicMetadata PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.PlaintextTopicMetadataTest > testBasicTopicMetadata STARTED

kafka.integration.PlaintextTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.MetricsDuringTopicCreationDeletionTest > 
testMetricsDuringTopicCreateDelete STARTED

kafka.integration.MetricsDuringTopicCreationDeletionTest > 
testMetricsDuringTopicCreateDelete PASSED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow PASSED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooLow 
STARTED


[GitHub] kafka pull request #2953: handle 0 futures in all()

2017-05-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2953


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [DISCUSS] KIP-136: Add Listener name and Security Protocol name to SelectorMetrics tags

2017-05-02 Thread Jun Rao
Hi, Edo,

Thanks for the proposal. It looks good to me.

Jun

On Thu, Mar 30, 2017 at 8:51 AM, Edoardo Comar  wrote:

> Hi all,
>
> We created KIP-136: Add Listener name and Security Protocol name to
> SelectorMetrics tags
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 136%3A+Add+Listener+name+and+Security+Protocol+name+to+
> SelectorMetrics+tags
>
> Please help review the KIP. Your feedback is appreciated!
>
> cheers,
> Edo
> --
> Edoardo Comar
> IBM MessageHub
> eco...@uk.ibm.com
> IBM UK Ltd, Hursley Park, SO21 2JN
>
> IBM United Kingdom Limited Registered in England and Wales with number
> 741598 Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6
> 3AU
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>


[jira] [Created] (KAFKA-5163) Support replicas movement between log directories (KIP-113)

2017-05-02 Thread Dong Lin (JIRA)
Dong Lin created KAFKA-5163:
---

 Summary: Support replicas movement between log directories 
(KIP-113)
 Key: KAFKA-5163
 URL: https://issues.apache.org/jira/browse/KAFKA-5163
 Project: Kafka
  Issue Type: Bug
Reporter: Dong Lin
Assignee: Dong Lin


See 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-113%3A+Support+replicas+movement+between+log+directories
 for motivation and design.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] KIP 151 - Expose Connector type in REST API

2017-05-02 Thread Ewen Cheslack-Postava
Both the proposal and PR look good to me.

-Ewen

On Tue, May 2, 2017 at 1:48 PM, dan  wrote:

> hello.
>
> in an attempt to make the connect rest endpoints more useful i'd like to
> add the Connector type (Sink/Source) in our rest endpoints to make them
> more self descriptive.
>
> KIP here:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 151+Expose+Connector+type+in+REST+API
> initial pr: https://github.com/apache/kafka/pull/2960
>
> thanks
> dan
>


[jira] [Assigned] (KAFKA-5161) reassign-partitions to check if broker of ID exists in cluster

2017-05-02 Thread huxi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huxi reassigned KAFKA-5161:
---

Assignee: huxi

> reassign-partitions to check if broker of ID exists in cluster
> --
>
> Key: KAFKA-5161
> URL: https://issues.apache.org/jira/browse/KAFKA-5161
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.10.1.1
> Environment: Debian 8
>Reporter: Lawrence Weikum
>Assignee: huxi
>Priority: Minor
>
> A topic was created with only one replica. We wanted to increase it later to 
> 3 replicas. A JSON file was created, but the IDs for the brokers were 
> incorrect and not part of the system. 
> The script or the brokers receiving the reassignment command should first 
> check that the new IDs exist in the cluster and only then continue, throwing 
> an error to the user if any of them doesn't.
> The current effect of assigning partitions to non-existent brokers is a stuck 
> replication assignment with no way to stop it. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2963: Minor: Removed use of keySerde in CachingSessionSt...

2017-05-02 Thread KyleWinkelman
GitHub user KyleWinkelman opened a pull request:

https://github.com/apache/kafka/pull/2963

Minor: Removed use of keySerde in CachingSessionStore.

CachingSessionStore wasn't properly using the default keySerde if no Serde 
was supplied. I saw the below error in the logs for one of my test cases.

ERROR stream-thread 
[cogroup-integration-test-3-5570fe48-d2a3-4271-80b1-81962295553d-StreamThread-6]
 Streams application error during processing:  
(org.apache.kafka.streams.processor.internals.StreamThread:335)
java.lang.NullPointerException
at 
org.apache.kafka.streams.state.internals.CachingSessionStore.findSessions(CachingSessionStore.java:93)
at 
org.apache.kafka.streams.kstream.internals.KStreamSessionWindowAggregate$KStreamSessionWindowAggregateProcessor.process(KStreamSessionWindowAggregate.java:94)
at 
org.apache.kafka.streams.processor.internals.ProcessorNode$1.run(ProcessorNode.java:47)
at 
org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:187)
at 
org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:133)
at 
org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:82)
at 
org.apache.kafka.streams.processor.internals.SourceNode.process(SourceNode.java:69)
at 
org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:206)
at 
org.apache.kafka.streams.processor.internals.StreamThread.processAndPunctuate(StreamThread.java:657)
at 
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:728)
at 
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:327)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/KyleWinkelman/kafka 
CachingSessionStore-fix-keySerde-use

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2963.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2963


commit 41592448a401434df475a1dfb2ab2c4d2b6a0508
Author: Kyle Winkelman 
Date:   2017-05-03T01:52:05Z

Removed use of keySerde in CachingSessionStore and substituted serdes
which will contain the default keySerde if one is not supplied.
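
For reference, a hedged sketch of the fallback being described (the helper below 
is hypothetical and not the code in this pull request): prefer the serde supplied 
to the store, otherwise fall back to the default key serde from the processor 
context, so that findSessions never operates with a null serializer.

{code}
import org.apache.kafka.common.serialization.Serde
import org.apache.kafka.streams.processor.ProcessorContext

// Resolve the key serde to use: the explicitly supplied one if present,
// otherwise the default key serde configured on the processor context.
def effectiveKeySerde[K](supplied: Option[Serde[K]], context: ProcessorContext): Serde[K] =
  supplied.getOrElse(context.keySerde.asInstanceOf[Serde[K]])
{code}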




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Comment Edited] (KAFKA-5155) Messages can be deleted prematurely when some producers use timestamps and some not

2017-05-02 Thread huxi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994177#comment-15994177
 ] 

huxi edited comment on KAFKA-5155 at 5/3/17 2:27 AM:
-

This is very similar to a JIRA 
issue ([kafka-4398|https://issues.apache.org/jira/browse/KAFKA-4398]) that I 
reported, complaining that the broker side cannot honor the order of 
timestamps.
It sounds like you cannot mix new timestamps and old timestamps given the 
current design.


was (Author: huxi_2b):
This is very similar with a jira 
issue([kafka-4398|https://issues.apache.org/jira/browse/KAFKA-4398]) reported 
by me complaining of the fact that Kafka cannot broker side cannot honor the 
order of timestamp.   
Sounds like you cannot mix up the new timestamps and old timestamps based on 
the current design.

> Messages can be deleted prematurely when some producers use timestamps and 
> some not
> ---
>
> Key: KAFKA-5155
> URL: https://issues.apache.org/jira/browse/KAFKA-5155
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.10.2.0
>Reporter: Petr Plavjaník
>
> Some messages can be deleted prematurely and never read in the following 
> scenario. One producer uses timestamps and produces messages that are 
> appended to the beginning of a log segment. Another producer produces 
> messages without a timestamp. In that case the segment's largest timestamp is 
> determined by the old messages that carry a timestamp; the new messages 
> without a timestamp do not influence it, so the log segment containing both 
> old and new messages can be deleted immediately after the last new message 
> with no timestamp is appended. When all appended messages have no timestamp, 
> they are not deleted, because the {{lastModified}} attribute of the 
> {{LogSegment}} is used instead.
> New test case to {{kafka.log.LogTest}} that fails:
> {code}
>   @Test
>   def 
> shouldNotDeleteTimeBasedSegmentsWhenTimestampIsNotProvidedForSomeMessages() {
> val retentionMs = 1000
> val old = TestUtils.singletonRecords("test".getBytes, timestamp = 0)
> val set = TestUtils.singletonRecords("test".getBytes, timestamp = -1, 
> magicValue = 0)
> val log = createLog(set.sizeInBytes, retentionMs = retentionMs)
> // append some messages to create some segments
> log.append(old)
> for (_ <- 0 until 12)
>   log.append(set)
> assertEquals("No segment should be deleted", 0, log.deleteOldSegments())
>   }
> {code}
> It can be prevented by using {{def largestTimestamp = 
> Math.max(maxTimestampSoFar, lastModified)}} in LogSegment, or by using 
> current timestamp when messages with timestamp {{-1}} are appended.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5130) Change InterBrokerSendThread to use a Queue per broker

2017-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994187#comment-15994187
 ] 

ASF GitHub Bot commented on KAFKA-5130:
---

GitHub user guozhangwang opened a pull request:

https://github.com/apache/kafka/pull/2964

KAFKA-5130: Refactor TC In-memory Cache

1. Collapsed the `ownedPartitions`, `pendingTxnMap` and the 
`transactionMetadataCache` into a single in-memory structure, which is a 
two-layered map: first keyed by the transaction log partition, and then valued 
with that partition's current coordinatorEpoch plus another map keyed by the 
transactional id (see the illustrative sketch after this list).

2. Use `transactionalId` across the modules in transactional coordinator, 
attach this id with the transactional marker entries.

3. Use two keys: `transactionalId` and `txnLogPartitionId` in the 
writeMarkerPurgatory as well as passing it along with the TxnMarkerEntry, so 
that `TransactionMarkerRequestCompletionHandler` can use it to access the 
two-layered map upon getting responses.

4. Use one queue per `broker-id` and `txnLogPartitionId`. Also when there 
is a possible update on the end point associated with the `broker-id`, update 
the Node without clearing the queue but relying on the requests to retry in the 
next round.
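
The two-layered structure in item 1 can be pictured with the following purely 
illustrative sketch (all names are hypothetical stand-ins, not the actual Kafka 
classes or the code in this pull request):

{code}
import scala.collection.concurrent.TrieMap

// Outer map: transaction log partition id -> (coordinatorEpoch, inner map).
// Inner map: transactionalId -> transaction metadata of type M.
final class TxnLogPartitionCache[M] {
  final case class PartitionEntry(coordinatorEpoch: Int,
                                  metadataByTransactionalId: TrieMap[String, M] = TrieMap.empty)

  private val byTxnLogPartition = TrieMap.empty[Int, PartitionEntry]

  // Register (or fetch) the entry for a transaction log partition owned by this broker.
  def loadPartition(txnLogPartitionId: Int, coordinatorEpoch: Int): PartitionEntry =
    byTxnLogPartition.getOrElseUpdate(txnLogPartitionId, PartitionEntry(coordinatorEpoch))

  // Look up metadata with the two keys mentioned in item 3.
  def get(txnLogPartitionId: Int, transactionalId: String): Option[M] =
    byTxnLogPartition.get(txnLogPartitionId).flatMap(_.metadataByTransactionalId.get(transactionalId))
}
{code}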

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/guozhangwang/kafka 
K5130-refactor-tc-inmemory-cache

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2964.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2964


commit 758a74c835a6fb32ae4c142b276699820644f068
Author: Guozhang Wang 
Date:   2017-05-02T18:09:35Z

collapse ownedPartition with MetadataCache

commit 3495d13e69123d6097c4f05bcae5f2bdcf7ccff5
Author: Guozhang Wang 
Date:   2017-05-03T01:21:57Z

centralize callbacks for send markers and append to log

commit c6548003d7c835857a226e57810b104a8501645e
Author: Guozhang Wang 
Date:   2017-05-03T01:39:09Z

minor refactoring

commit 218a572e1c612bad5f9a15d90d37b7c6c2b29d99
Author: Guozhang Wang 
Date:   2017-05-03T02:18:42Z

rebase from trunk

commit 19f033e69091c5a656a473374ea6031d01fcd061
Author: Guozhang Wang 
Date:   2017-05-03T02:32:41Z

minor fixes




> Change InterBrokerSendThread to use a Queue per broker
> --
>
> Key: KAFKA-5130
> URL: https://issues.apache.org/jira/browse/KAFKA-5130
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Reporter: Damian Guy
>
> Change the {{InterBrokerSendThread}} to use a queue per broker and only 
> attempt to send to brokers that are ready



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Build failed in Jenkins: kafka-trunk-jdk8 #1473

2017-05-02 Thread Apache Jenkins Server
See 


Changes:

[ismael] KAFKA-5131; WriteTxnMarkers and complete commit/abort on partition

[ismael] MINOR: Serialize the real isolationLevel in FetchRequest

--
[...truncated 1.64 MB...]
kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithCollision PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokerListWithNoTopics STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokerListWithNoTopics PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testAutoCreateTopic STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testGetAllTopicMetadata 
STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testGetAllTopicMetadata 
PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testBasicTopicMetadata 
STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.PrimitiveApiTest > testMultiProduce STARTED

kafka.integration.PrimitiveApiTest > testMultiProduce PASSED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch STARTED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch PASSED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize 
STARTED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize PASSED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests STARTED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests PASSED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch STARTED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch PASSED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression STARTED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression PASSED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic STARTED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic PASSED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest STARTED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack PASSED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopicWithCollision 
STARTED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopicWithCollision 
PASSED

kafka.integration.PlaintextTopicMetadataTest > testAliveBrokerListWithNoTopics 
STARTED

kafka.integration.PlaintextTopicMetadataTest > testAliveBrokerListWithNoTopics 
PASSED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopic STARTED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.PlaintextTopicMetadataTest > testGetAllTopicMetadata STARTED

kafka.integration.PlaintextTopicMetadataTest > testGetAllTopicMetadata PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.PlaintextTopicMetadataTest > testBasicTopicMetadata STARTED

kafka.integration.PlaintextTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.MetricsDuringTopicCreationDeletionTest > 
testMetricsDuringTopicCreateDelete STARTED

kafka.integration.MetricsDuringTopicCreationDeletionTest > 
testMetricsDuringTopicCreateDelete PASSED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow PASSED


Re: [DISCUSS] KIP 145 - Expose Record Headers in Kafka Connect

2017-05-02 Thread Ewen Cheslack-Postava
Michael,

Aren't JMS headers an example where the variety is a problem? Unless I'm
misunderstanding, there's not even a fixed serialization format expected
for them since JMS defines the runtime types, not the wire format. For
example, we have JMSCorrelationID (String), JMSExpiration (Long), and
JMSReplyTo (Destination). These are simply runtime types, so we'd need
either (a) a different serializer/deserializer for each or (b) a
serializer/deserializer that can handle all of them (e.g. Avro, JSON, etc).

What is the actual serialized format of the different fields? And if it's
not specified anywhere in the KIP, why should using the well-known type for
the header key (e.g. use StringSerializer, IntSerializer, etc) be better or
worse than using a general serialization format (e.g. Avro, JSON)? And if
the latter is the choice, how do you decide on the format?

-Ewen

On Tue, May 2, 2017 at 12:48 PM, Michael André Pearce <
michael.andre.pea...@me.com> wrote:

> Hi Ewen,
>
> So on the point of JMS, the predefined/standardised JMS and JMSX headers
> have predefined types, so these can be serialised/deserialised accordingly.
>
> Custom JMS headers, agreed, could be a bit more difficult, but on the 80/20
> rule I would say they're mostly string values, and as you can hold bytes as a
> string anyway, defaulting to that wouldn't cause any issue.
>
> But I think we may easily be able to do one better.
>
> Obviously you can override/configure the headers converter, but we could supply
> a default converter that takes a config file with a key-to-type mapping.
>
> Allowing people to define/declare a header key with the expected
> type in some property file? To support string, byte[] and primitives? And
> undefined headers just default to either String or byte[].
>
> We could also pre-define known headers like the JMS ones mentioned above.
>
> E.g
>
> AwesomeHeader1=boolean
> AwesomeHeader2=long
> JMSCorrelationId=String
> JMSXGroupId=String
>
>
> What do you think?
>
>
> Cheers
> Mike
>
>
>
>
>
>
> Sent from my iPhone
>
> > On 2 May 2017, at 18:45, Ewen Cheslack-Postava 
> wrote:
> >
> > A couple of thoughts:
> >
> > First, agreed that we definitely want to expose header functionality.
> Thank
> > you Mike for starting the conversation! Even if Connect doesn't do
> anything
> > special with it, there's value in being able to access/set headers.
> >
> > On motivation -- I think there are much broader use cases. When thinking
> > about exposing headers, I'd actually use Replicator as only a minor
> > supporting case. The reason is that it is a very uncommon case where
> there
> > is zero impedance mismatch between the source and sink of the data since
> > they are both Kafka. This means you don't need to think much about data
> > formats/serialization. I think the JMS use case is a better example since
> > JMS headers and Kafka headers don't quite match up. Here's a quick list
> of
> > use cases I can think of off the top of my head:
> >
> > 1. Include headers from other systems that support them: JMS (or really
> any
> > MQ), HTTP
> > 2. Other connector-specific headers. For example, from JDBC maybe the
> table
> > the data comes from is a header; for a CDC connector you might include
> the
> > binlog offset as a header.
> > 3. Interceptor/SMT-style use cases for annotating things like provenance
> of
> > data:
> > 3a. Generically w/ user-supplied data like data center, host, app ID,
> etc.
> > 3b. Kafka Connect framework level info, such as the connector/task
> > generating the data
> >
> > On deviation from Connect's model -- to be honest, the KIP-82 also
> deviates
> > quite substantially from how Kafka handles data already, so we may
> struggle
> > a bit to rectify the two. (In particular, headers specify some structure
> > and enforce strings specifically for header keys, but then require you to
> > do serialization of header values yourself...).
> >
> > I think the use cases I mentioned above may also need different
> approaches
> > to how the data in headers are handled. As Gwen mentions, if we expose
> the
> > headers to Connectors, they need to have some idea of the format and the
> > reason for byte[] values in KIP-82 is to leave that decision up to the
> > organization using them. But without knowing the format, connectors can't
> > really do anything with them -- if a source connector assumes a format,
> > they may generate data incompatible with the format used by the rest of
> the
> > organization. On the other hand, I have a feeling most people will just
> use
> >  headers, so allowing connectors to embed arbitrarily
> > complex data may not work out well in practice. Or maybe we leave it
> > flexible, most people default to using StringConverter for the serializer
> > and Connectors will end up defaulting to that just for compatibility...
> >
> > I'm not sure I have a real proposal yet, but I do think understanding the
> > impact of using a Converter for headers would be useful, 
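
As a rough sketch of the key-to-type mapping idea discussed above (every class,
file, and property name here is hypothetical and not part of the KIP), a default
header reader could look up each header key in a properties file and fall back to
raw bytes for unknown keys:

import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical sketch only: maps well-known header keys to primitive types via a
// properties file (e.g. "JMSCorrelationID=string", "AwesomeHeader2=long").
public class TypedHeaderReader {
    private final Properties headerTypes = new Properties();

    public TypedHeaderReader(String mappingFile) throws IOException {
        try (FileInputStream in = new FileInputStream(mappingFile)) {
            headerTypes.load(in);
        }
    }

    public Object read(String key, byte[] value) {
        String type = headerTypes.getProperty(key, "bytes");
        switch (type) {
            case "string":  return new String(value, StandardCharsets.UTF_8);
            case "long":    return ByteBuffer.wrap(value).getLong();
            case "boolean": return value.length > 0 && value[0] != 0;
            default:        return value; // undefined keys stay as raw bytes
        }
    }
}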

Re: [DISCUSS] KIP-146: Classloading Isolation in Connect

2017-05-02 Thread Stephane Maarek
Thanks for the work, it’s definitely needed!
I’d like to suggest to take it one step further.

To me, I’d like to see Kafka Connect the same way we have Docker and Docker 
repositories.

Here’s how I would envision the flow:
- Kafka Connect workers are just workers. They come with no jars whatsoever
- The REST API allows you to add a config to the connect cluster 
- The workers, seeing the config, pull the jars from the available (maven?) 
repositories (public or private)
- Classpath isolation comes into play so that the pulled jar doesn’t interact 
with other connectors
- Additionally, I believe the config should have a “tag” or “version” (like 
docker really), so that 
o you can run different versions of the same connector on your connect cluster
o configurations are strongly linked to a connector (right now if I update my 
connector jars, I may break my configuration)

I know this is a bit out of scope, but if major changes are coming to connect 
then these are my two cents.

Finally, maybe extend that construct to Transformers. The ability to 
externalise transformers as jars would democratize their usage IMO


On 3/5/17, 4:24 am, "Ewen Cheslack-Postava"  wrote:

Thanks for the KIP.

A few responses inline, followed by additional comments.

On Mon, May 1, 2017 at 9:50 PM, Konstantine Karantasis <
konstant...@confluent.io> wrote:

> Gwen, Randall thank you for your very insightful observations. I'm glad 
you
> find this first draft to be an adequate platform for discussion.
>
> I'll attempt replying to your comments in order.
>
> Gwen, I also debated exactly the same two options: a) interpreting absence
> of module path as a user's intention to turn off isolation and b)
> explicitly using an additional boolean property. A few reasons why I went
> with b) in this first draft are:
> 1) As Randall mentions, to leave the option of using a default value open.
> If not immediately in the first version of isolation, maybe in the future.
> 2) I didn't like the implicit character of the choice of interpreting an
> empty string as a clear intention to turn isolation off by the user. Half
> the time could be just that users forget to set a location, although 
they'd
> like to use class loading isolation.
> 3) There's a slim possibility that in rare occasions a user might want to
> avoid even the slightest increase in memory consumption due to class
> loading duplication. I admit this should be very rare, but given the other
> concerns and that we would really like to keep the isolation 
implementation
> simple, the option to turn off this feature by using only one additional
> config property might not seem too excessive. At least at the start of 
this
> discussion.
> 4) Debugging during development might be simpler in some cases.
> 5) Finally, as you mention, this could allow for smoother upgrades.
>

I'm not sure any of these keep you from removing the extra config. Is there
any reason you couldn't have clean support for relying on the CLASSPATH
while still supporting the classloaders? Then getting people onto the new
classloaders does require documentation for how to install connectors, but
that's pretty minimal. And we don't break existing installations where
people are just adding to the CLASSPATH. It seems like this:

1. Allows you to set a default. Isolation is always enabled, but we won't
include any paths/directories we already use. Setting a default just
requires specifying a new location where we'd hold these directories.
2. It doesn't require the implicit choice -- you actually never turn off
isolation, but still support the regular CLASSPATH with an empty list of
isolated loaders
3. The user can still use CLASSPATH if they want to minimize classloader
overhead
4. Debugging can still use CLASSPATH
5. Upgrades just work.


>
> Randall, regarding your comments:
> 1) To keep its focus narrow, this KIP, as well as the first implementation
> of isolation in Connect, assume filesystem based discovery. With careful
> implementation, transitioning to discovery schemes that support broader
> URIs I believe should be easy in the future.
>

Maybe just mention a couple of quick examples in the KIP. When described
inline it might be more obvious that it will extend cleanly.


> 2) The example you give makes a good point. However I'm inclined to say
> that such cases should be addressed more as exceptions rather than as 
being
> the common case. Therefore, I wouldn't see all dependencies imported by 
the
> framework as required to be filtered out, because in that case we lose the
> advantage of isolation between the framework and the connectors (and we 
are
> left only with isolation between connectors).

3) I 
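
A minimal sketch of the kind of per-connector classloader isolation being discussed
(illustrative only; the class name and the one-directory-per-connector layout are
assumptions, not the KIP's design):

import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical sketch: each connector directory gets its own URLClassLoader whose
// parent is the framework's loader, so framework classes stay shared while one
// connector's dependencies remain invisible to every other connector.
public class ConnectorClassLoaders {

    public static URLClassLoader forConnectorDir(Path connectorDir) throws Exception {
        try (Stream<Path> files = Files.list(connectorDir)) {
            URL[] jars = files
                    .filter(p -> p.toString().endsWith(".jar"))
                    .map(ConnectorClassLoaders::toUrl)
                    .toArray(URL[]::new);
            return new URLClassLoader(jars, ConnectorClassLoaders.class.getClassLoader());
        }
    }

    private static URL toUrl(Path p) {
        try {
            return p.toUri().toURL();
        } catch (MalformedURLException e) {
            throw new RuntimeException(e);
        }
    }
}

Note that a plain parent-first loader like this only isolates connectors from one
another; isolating connector dependencies from the framework's own dependencies,
which the thread also touches on, would additionally need a child-first
(parent-last) delegation strategy.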

Jenkins build is back to normal : kafka-trunk-jdk7 #2141

2017-05-02 Thread Apache Jenkins Server
See 




[jira] [Commented] (KAFKA-4667) Connect should create internal topics

2017-05-02 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994268#comment-15994268
 ] 

Ewen Cheslack-Postava commented on KAFKA-4667:
--

[~rhauch] Discussed some of these issues offline, but there are some other 
refinements here too:

* based on KIP-115, a default replication factor of 3 (overridden in our demo 
configs to make running on a single local node easy) is probably the least 
likely to cause unexpected problems
* partitions = 1 is only really true for config topic where ordering between 
differing keys is required. For status and offsets topics you can go higher, 
and Confluent recommends doing so: 
http://docs.confluent.io/current/connect/userguide.html#distributed-mode
* Not sure if we should fiddle with unclean leader election defaults. But maybe 
it is ok to set this to false and tell users if they want the unsafe version 
they need to set up the topic themselves before starting Connect? Not sure what 
the right tradeoff of respecting users' existing settings vs encouraging better 
behavior is. This might even differ between the three topics -- status topics 
are way less critical than config topic.
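
As a rough illustration only (the topic names, partition counts and ZooKeeper address
below are placeholders, and the names must match the worker's {{config.storage.topic}},
{{offset.storage.topic}} and {{status.storage.topic}} settings), pre-creating the three
topics by hand with the settings discussed above could look like:

{code}
bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic connect-configs \
  --partitions 1 --replication-factor 3 \
  --config cleanup.policy=compact --config unclean.leader.election.enable=false

bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic connect-offsets \
  --partitions 25 --replication-factor 3 \
  --config cleanup.policy=compact --config unclean.leader.election.enable=false

bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic connect-status \
  --partitions 5 --replication-factor 3 \
  --config cleanup.policy=compact --config unclean.leader.election.enable=false
{code}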

> Connect should create internal topics
> -
>
> Key: KAFKA-4667
> URL: https://issues.apache.org/jira/browse/KAFKA-4667
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Emanuele Cesena
>Priority: Critical
> Fix For: 0.11.0.0
>
>
> I'm reporting this as an issue but in fact it requires more investigation 
> (which unfortunately I'm not able to perform at this time).
> Repro steps:
> - configure Kafka for consistency, for example:
> default.replication.factor=3
> min.insync.replicas=2
> unclean.leader.election.enable=false
> - run Connect for the first time, which should create its internal topics
> I believe these topics are created with the broker's default, in particular:
> min.insync.replicas=2
> unclean.leader.election.enable=false
> but connect doesn't produce with acks=all, which in turn may cause the 
> cluster to go into a bad state (see, e.g., 
> https://issues.apache.org/jira/browse/KAFKA-4666).
> The solution would be to force availability mode, i.e. force:
> unclean.leader.election.enable=true
> when creating the connect topics, or vice versa, detect availability vs 
> consistency mode and turn on acks=all if needed.
> I assume the same happens with other kafka-based services such as streams.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Build failed in Jenkins: kafka-trunk-jdk8 #1474

2017-05-02 Thread Apache Jenkins Server
See 


Changes:

[ismael] MINOR: Handle 0 futures in all()

--
[...truncated 842.54 KB...]
kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithCollision PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokerListWithNoTopics STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokerListWithNoTopics PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testAutoCreateTopic STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testGetAllTopicMetadata 
STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testGetAllTopicMetadata 
PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testBasicTopicMetadata 
STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.PrimitiveApiTest > testMultiProduce STARTED

kafka.integration.PrimitiveApiTest > testMultiProduce PASSED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch STARTED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch PASSED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize 
STARTED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize PASSED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests STARTED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests PASSED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch STARTED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch PASSED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression STARTED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression PASSED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic STARTED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic PASSED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest STARTED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack PASSED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopicWithCollision 
STARTED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopicWithCollision 
PASSED

kafka.integration.PlaintextTopicMetadataTest > testAliveBrokerListWithNoTopics 
STARTED

kafka.integration.PlaintextTopicMetadataTest > testAliveBrokerListWithNoTopics 
PASSED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopic STARTED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.PlaintextTopicMetadataTest > testGetAllTopicMetadata STARTED

kafka.integration.PlaintextTopicMetadataTest > testGetAllTopicMetadata PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.PlaintextTopicMetadataTest > testBasicTopicMetadata STARTED

kafka.integration.PlaintextTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.MetricsDuringTopicCreationDeletionTest > 
testMetricsDuringTopicCreateDelete STARTED

kafka.integration.MetricsDuringTopicCreationDeletionTest > 
testMetricsDuringTopicCreateDelete PASSED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow PASSED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooLow 
STARTED


[jira] [Commented] (KAFKA-3754) Kafka default -Xloggc settings should include GC log rotation flags

2017-05-02 Thread James Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994280#comment-15994280
 ] 

James Cheng commented on KAFKA-3754:


FYI, I tried out this change in my cluster today. The JVM's GC log rotation 
isn't what I expected:
{code}
$ ls -ltr --full-time kafkaServer-gc.log*
-rw-r--r--. 1 root root 2829636 2017-05-03 00:48:26.0 + 
kafkaServer-gc.log
-rw-r--r--. 1 root root  103794 2017-05-03 00:57:01.0 + 
kafkaServer-gc.log.0
-rw-r--r--. 1 root root   14939 2017-05-03 00:59:59.0 + 
kafkaServer-gc.log.1.current
{code}

It writes to kafkaServer-gc.log, then .log.1, then log.2, etc, until it reaches 
10 (the number I configured) and then wraps back around to kafkaServer-gc.log.

I had expected that "kafkaServer-gc.log" would always be the current one, and 
then the .1 .2 .3 etc would be progressively older ones.

Just thought I'd share that info, in case others are interested. But, it *does* 
rotate the logs, and so will limit disk usage.
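
For anyone wanting to try the same change, the amended line in kafka-run-class.sh
would look something like this (the file count and size are only example values,
not project defaults):

{code}
KAFKA_GC_LOG_OPTS="-Xloggc:$LOG_DIR/$GC_LOG_FILE_NAME -verbose:gc -XX:+PrintGCDetails \
  -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps \
  -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M"
{code}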


> Kafka default -Xloggc settings should include GC log rotation flags
> ---
>
> Key: KAFKA-3754
> URL: https://issues.apache.org/jira/browse/KAFKA-3754
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ryan P
>Assignee: Ryan P
>Priority: Minor
>
> By default kafka-run-class.sh defines it's GC settings like so:
> KAFKA_GC_LOG_OPTS="-Xloggc:$LOG_DIR/$GC_LOG_FILE_NAME -verbose:gc 
> -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps "
> This has to potential to generate incredibly large log files when left 
> unmanaged. 
> Instead it should include some sort of default GC log retention policy by 
> adding the following flags:
> -XX:+UseGCLogFileRotation 
> -XX:NumberOfGCLogFiles= 
>  -XX:GCLogFileSize=
> http://www.oracle.com/technetwork/java/javase/7u2-relnotes-1394228.html 
> details these flags and their defaults



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2962: Kafka5161 reassign-partitions to check if broker o...

2017-05-02 Thread amethystic
GitHub user amethystic opened a pull request:

https://github.com/apache/kafka/pull/2962

Kafka5161 reassign-partitions to check if broker of ID exists in cluster

Added code to check existence of the brokers in the proposed plan.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/amethystic/kafka 
kafka5161_reassign_check_invalid_brokerID

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2962.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2962


commit 522e5d5ea05016954ae05452c500130b2722c7ab
Author: amethystic 
Date:   2017-05-03T01:54:23Z

kafka-5161: reassign-partitions to check if broker of ID exists in cluster

Added code for non-existent brokerID checking in the proposed assignment 
plan.

commit c713728103eba20ff701b5c6c172086cdf23f460
Author: amethystic 
Date:   2017-05-03T01:56:36Z

kafka-5161: reassign-partitions to check if broker of ID exists in cluster

Added code to check brokerID existence in the proposed assignment plan.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5155) Messages can be deleted prematurely when some producers use timestamps and some not

2017-05-02 Thread huxi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994177#comment-15994177
 ] 

huxi commented on KAFKA-5155:
-

This is very similar to a jira 
issue ([kafka-4398|https://issues.apache.org/jira/browse/KAFKA-4398]) reported 
by me, complaining about the fact that the Kafka broker side cannot honor the 
order of timestamps.
Sounds like you cannot mix new timestamps and old timestamps based on 
the current design.

> Messages can be deleted prematurely when some producers use timestamps and 
> some not
> ---
>
> Key: KAFKA-5155
> URL: https://issues.apache.org/jira/browse/KAFKA-5155
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.10.2.0
>Reporter: Petr Plavjaník
>
> Some messages can be deleted prematurely and never read in the following 
> scenario. One producer uses timestamps and produces messages that are appended 
> to the beginning of a log segment. Another producer produces messages without a 
> timestamp. In that case the largest timestamp is determined by the old messages 
> with a timestamp, the new messages without a timestamp do not influence it, and 
> the log segment with old and new messages can be deleted immediately after the 
> last new message with no timestamp is appended. When all appended messages 
> have no timestamp, then they are not deleted, because the {{lastModified}} 
> attribute of the {{LogSegment}} is used.
> New test case to {{kafka.log.LogTest}} that fails:
> {code}
>   @Test
>   def 
> shouldNotDeleteTimeBasedSegmentsWhenTimestampIsNotProvidedForSomeMessages() {
> val retentionMs = 1000
> val old = TestUtils.singletonRecords("test".getBytes, timestamp = 0)
> val set = TestUtils.singletonRecords("test".getBytes, timestamp = -1, 
> magicValue = 0)
> val log = createLog(set.sizeInBytes, retentionMs = retentionMs)
> // append some messages to create some segments
> log.append(old)
> for (_ <- 0 until 12)
>   log.append(set)
> assertEquals("No segment should be deleted", 0, log.deleteOldSegments())
>   }
> {code}
> It can be prevented by using {{def largestTimestamp = 
> Math.max(maxTimestampSoFar, lastModified)}} in LogSegment, or by using 
> current timestamp when messages with timestamp {{-1}} are appended.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5153) KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting

2017-05-02 Thread huxi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994197#comment-15994197
 ] 

huxi commented on KAFKA-5153:
-

Could you set `replica.lag.time.max.ms`, `replica.fetch.wait.max.ms` to larger 
values and retry? 
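
For example, something along these lines in server.properties (the values are only
illustrative starting points; the defaults are 10000 ms and 500 ms respectively):

{code}
replica.lag.time.max.ms=30000
replica.fetch.wait.max.ms=1000
{code}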

> KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting
> ---
>
> Key: KAFKA-5153
> URL: https://issues.apache.org/jira/browse/KAFKA-5153
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.0
> Environment: RHEL 6
> Java Version  1.8.0_91-b14
>Reporter: Arpan
>Priority: Critical
> Attachments: server_1_72server.log, server_2_73_server.log, 
> server_3_74Server.log, server.properties, ThreadDump_1493564142.dump, 
> ThreadDump_1493564177.dump, ThreadDump_1493564249.dump
>
>
> Hi Team, 
> I was earlier referring to issue KAFKA-4477 because the problem I am facing 
> is similar. I tried to search for the same reference in the release docs as well 
> but did not find anything in 0.10.1.1 or 0.10.2.0. I am currently using 
> 2.11_0.10.2.0.
> I have a 3-node cluster for KAFKA and a cluster for ZK as well on the same set 
> of servers, in cluster mode. We have around 240GB of data getting 
> transferred through KAFKA every day. What we are observing is disconnection of 
> a server from the cluster and the ISR getting reduced, and it starts impacting 
> service.
> I have also observed the file descriptor count getting increased a bit; in normal 
> circumstances we have not observed an FD count of more than 500, but when the issue 
> started we were observing it in the range of 650-700 on all 3 servers. 
> Attaching thread dumps of all 3 servers from when we started facing the issue 
> recently.
> The issue vanishes once you bounce the nodes, and the setup does not run for 
> more than 5 days without this issue. Attaching server logs as well.
> Kindly let me know if you need any additional information. Attaching 
> server.properties as well for one of the servers (it's similar on all 3 
> servers).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS]: KIP-149: Enabling key access in ValueTransformer, ValueMapper, and ValueJoiner

2017-05-02 Thread Mathieu Fenniak
Hi Jeyhun,

This approach would change ValueMapper (...etc) to be classes, rather than
interfaces, which is also a backwards incompatible change.  An alternative
approach that would be backwards compatible would be to define new
interfaces, and provide overrides where those interfaces are used.
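
A minimal sketch of what such a new interface could look like (the name is purely
illustrative, not something the KIP has settled on):

public interface ValueMapperWithKey<K, V, VR> {
    // The key is passed in alongside the value; the existing ValueMapper stays untouched.
    VR apply(final K readOnlyKey, final V value);
}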

Unfortunately, making the key parameter as "final" doesn't change much
about guarding against key change.  It only prevents the parameter variable
from being reassigned.  If the key type is a mutable object (eg. byte[]),
it can still be mutated. (eg. key[0] = 0).  But I'm not really sure there's
much that can be done about that.

Mathieu


On Mon, May 1, 2017 at 5:39 PM, Jeyhun Karimov  wrote:

> Thanks for comments.
>
> The concerns make sense. Although we can guard for immutable keys in the
> current implementation (with a few changes), I didn't consider backward
> compatibility.
>
> In this case 2 solutions come to my mind. In both cases, the user accesses the
> key as an Object, as passing an extra type parameter would break
> backwards-compatibility.  So the user has to cast to the actual key type.
>
> 1. Firstly, we can overload the apply method with 2 arguments (key and value)
> and force the key to be *final*. By doing this, I think we can address both
> backward-compatibility and guarding against key changes.
>
> 2. Secondly, we can create class KeyAccess like:
>
> public class KeyAccess {
> Object key;
> public void beforeApply(final Object key) {
> this.key = key;
> }
> public Object getKey() {
> return key;
> }
> }
>
> We can extend *ValueMapper, ValueJoiner* and *ValueTransformer* from
> *KeyAccess*. Inside processor (for example *KTableMapValuesProcessor*)
> before calling *mapper.apply(value)* we can set the *key* by
> *mapper.beforeApply(key)*. As a result, user can use *getKey()* to access
> the key inside *apply(value)* method.
>
>
> Cheers,
> Jeyhun
>
>
>
>
> On Mon, May 1, 2017 at 7:24 PM Matthias J. Sax 
> wrote:
>
> > Jeyhun,
> >
> > thanks a lot for the KIP!
> >
> > I think there are two issues we need to address:
> >
> > (1) The KIP does not consider backward compatibility. Users did complain
> > about this in past releases already, and as the user base grows, we
> > should not break backward compatibility in future releases anymore.
> > Thus, we should think of a better way to allow key access.
> >
> > Mathieu's comment goes into the same direction
> >
> > >> On the other hand, the number of compile failures that would need to
> be
> > >> fixed from this change is unfortunate. :-)
> >
> > (2) Another concern is that there is no guard to prevent user code from
> > modifying the key. This might corrupt partitioning if users do alter the
> > key (accidentally -- or users are just not aware that they are not
> > allowed to modify the provided key object) and thus break the
> > application. (This was the original motivation for not providing the key in
> > the first place -- it guards against modification.)
> >
> >
> > -Matthias
> >
> >
> >
> > On 5/1/17 6:31 AM, Mathieu Fenniak wrote:
> > > Hi Jeyhun,
> > >
> > > I just want to add my voice that, I too, have wished for access to the
> > > record key during a mapValues or similar operation.
> > >
> > > On the other hand, the number of compile failures that would need to be
> > > fixed from this change is unfortunate. :-)  But at least it would all
> be
> > a
> > > pretty clear and easy change.
> > >
> > > Mathieu
> > >
> > >
> > > On Mon, May 1, 2017 at 6:55 AM, Jeyhun Karimov 
> > wrote:
> > >
> > >> Dear community,
> > >>
> > >> I want to share KIP-149 [1] based on issues KAFKA-4218 [2], KAFKA-4726
> > [3],
> > >> KAFKA-3745 [4]. The related PR can be found at [5].
> > >> I would like to get your comments.
> > >>
> > >> [1]
> > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > >> 149%3A+Enabling+key+access+in+ValueTransformer%2C+
> > >> ValueMapper%2C+and+ValueJoiner
> > >> [2] https://issues.apache.org/jira/browse/KAFKA-4218
> > >> [3] https://issues.apache.org/jira/browse/KAFKA-4726
> > >> [4] https://issues.apache.org/jira/browse/KAFKA-3745
> > >> [5] https://github.com/apache/kafka/pull/2946
> > >>
> > >>
> > >> Cheers,
> > >> Jeyhun
> > >>
> > >>
> > >>
> > >> --
> > >> -Cheers
> > >>
> > >> Jeyhun
> > >>
> > >
> >
> > --
> -Cheers
>
> Jeyhun
>


[GitHub] kafka pull request #2957: KAFKA-5132: abort long running transactions

2017-05-02 Thread dguy
GitHub user dguy opened a pull request:

https://github.com/apache/kafka/pull/2957

KAFKA-5132: abort long running transactions

Abort any ongoing transactions that haven't been touched for longer than 
the transaction timeout

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dguy/kafka kafka-5132

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2957.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2957


commit b680920ec2359b3b13e6134f4141ca5caee82fc6
Author: Damian Guy 
Date:   2017-05-02T15:07:50Z

abort Ongoing transactions that have expired




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5132) Abort long running transactions

2017-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993061#comment-15993061
 ] 

ASF GitHub Bot commented on KAFKA-5132:
---

GitHub user dguy opened a pull request:

https://github.com/apache/kafka/pull/2957

KAFKA-5132: abort long running transactions

Abort any ongoing transactions that haven't been touched for longer than 
the transaction timeout

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dguy/kafka kafka-5132

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2957.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2957


commit b680920ec2359b3b13e6134f4141ca5caee82fc6
Author: Damian Guy 
Date:   2017-05-02T15:07:50Z

abort Ongoing transactions that have expired




> Abort long running transactions
> ---
>
> Key: KAFKA-5132
> URL: https://issues.apache.org/jira/browse/KAFKA-5132
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Reporter: Damian Guy
>Assignee: Damian Guy
>
> We need to abort any transactions that have been running longer than the txn 
> timeout



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5158) Options for handling exceptions during processing

2017-05-02 Thread Eno Thereska (JIRA)
Eno Thereska created KAFKA-5158:
---

 Summary: Options for handling exceptions during processing
 Key: KAFKA-5158
 URL: https://issues.apache.org/jira/browse/KAFKA-5158
 Project: Kafka
  Issue Type: Sub-task
  Components: streams
Reporter: Eno Thereska


Imagine the app-level processing of a (non-corrupted) record fails (e.g. the 
user attempted to do an RPC to an external system, and this call failed). How 
can you process such failed records in a scalable way? For example, imagine you 
need to implement a retry policy such as "retry with exponential backoff". 
Here, you have the problem that 1. you can't really pause processing a single 
record because this will pause the processing of the full stream (bottleneck!) 
and 2. there is no straight-forward way to "sort" failed records based on their 
"next retry time".



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-4743) Add a tool to Reset Consumer Group Offsets

2017-05-02 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-4743:
---
Status: Patch Available  (was: Open)

> Add a tool to Reset Consumer Group Offsets
> --
>
> Key: KAFKA-4743
> URL: https://issues.apache.org/jira/browse/KAFKA-4743
> Project: Kafka
>  Issue Type: New Feature
>  Components: consumer, core, tools
>Reporter: Jorge Quilcate
>  Labels: kip
>
> Add an external tool to reset Consumer Group offsets, and achieve rewind over 
> the topics, without changing client-side code.
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-4743) Add a tool to Reset Consumer Group Offsets

2017-05-02 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-4743:
---
Fix Version/s: 0.11.0.0

> Add a tool to Reset Consumer Group Offsets
> --
>
> Key: KAFKA-4743
> URL: https://issues.apache.org/jira/browse/KAFKA-4743
> Project: Kafka
>  Issue Type: New Feature
>  Components: consumer, core, tools
>Reporter: Jorge Quilcate
>  Labels: kip
> Fix For: 0.11.0.0
>
>
> Add an external tool to reset Consumer Group offsets, and achieve rewind over 
> the topics, without changing client-side code.
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5155) Messages can be deleted prematurely when some producers use timestamps and some not

2017-05-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/KAFKA-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Petr Plavjaník updated KAFKA-5155:
--
Description: 
Some messages can be deleted prematurely and never read in the following scenario. 
One producer uses timestamps and produces messages that are appended to the 
beginning of a log segment. Another producer produces messages without a 
timestamp. In that case the largest timestamp is determined by the old messages with 
a timestamp, the new messages without a timestamp do not influence it, and the log 
segment with old and new messages can be deleted immediately after the last new 
message with no timestamp is appended. When all appended messages have no 
timestamp, then they are not deleted, because the {{lastModified}} attribute of the 
{{LogSegment}} is used.

New test case to {{kafka.log.LogTest}} that fails:
{code}
  @Test
  def 
shouldNotDeleteTimeBasedSegmentsWhenTimestampIsNotProvidedForSomeMessages() {
val retentionMs = 1000
val old = TestUtils.singletonRecords("test".getBytes, timestamp = 0)
val set = TestUtils.singletonRecords("test".getBytes, timestamp = -1, 
magicValue = 0)
val log = createLog(set.sizeInBytes, retentionMs = retentionMs)

// append some messages to create some segments
log.append(old)
for (_ <- 0 until 12)
  log.append(set)

assertEquals("No segment should be deleted", 0, log.deleteOldSegments())
  }
{code}

It can be prevented by using {{def largestTimestamp = 
Math.max(maxTimestampSoFar, lastModified)}} in LogSegment, or by using current 
timestamp when messages with timestamp {{-1}} are appended.

  was:
Some messages can be deleted prematurely and never read in the following scenario. 
One producer uses timestamps and produces messages that are appended to the 
beginning of a log segment. Another producer produces messages without a 
timestamp. In that case the largest timestamp is determined by the old messages with 
a timestamp, the new messages without a timestamp do not influence it, and the log 
segment with old and new messages can be deleted immediately after the last new 
message with no timestamp is appended. When all appended messages have no 
timestamp, then they are not deleted, because the {{lastModified}} attribute of the 
{{LogSegment}} is used.

New test case to {{kafka.log.LogTest}} that fails:
{code}
  @Test
  def 
shouldNotDeleteTimeBasedSegmentsWhenTimestampIsNotProvidedForSomeMessages() {
val retentionMs = 1000
val old = TestUtils.singletonRecords("test".getBytes, timestamp = 0)
val set = TestUtils.singletonRecords("test".getBytes, timestamp = -1)
val log = createLog(set.sizeInBytes, retentionMs = retentionMs)

// append some messages to create some segments
log.append(old)
for (_ <- 0 until 14)
  log.append(set)

log.deleteOldSegments()
assertEquals("There should be 3 segments remaining", 3, 
log.numberOfSegments)
  }
{code}

It can be prevented by using {{def largestTimestamp = 
Math.max(maxTimestampSoFar, lastModified)}} in LogSegment, or by using current 
timestamp when messages with timestamp {{-1}} are appended.


> Messages can be deleted prematurely when some producers use timestamps and 
> some not
> ---
>
> Key: KAFKA-5155
> URL: https://issues.apache.org/jira/browse/KAFKA-5155
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.10.2.0
>Reporter: Petr Plavjaník
>
> Some messages can be deleted prematurely and never read in the following 
> scenario. One producer uses timestamps and produces messages that are appended 
> to the beginning of a log segment. Another producer produces messages without a 
> timestamp. In that case the largest timestamp is determined by the old messages 
> with a timestamp, the new messages without a timestamp do not influence it, and 
> the log segment with old and new messages can be deleted immediately after the 
> last new message with no timestamp is appended. When all appended messages 
> have no timestamp, then they are not deleted, because the {{lastModified}} 
> attribute of the {{LogSegment}} is used.
> New test case to {{kafka.log.LogTest}} that fails:
> {code}
>   @Test
>   def 
> shouldNotDeleteTimeBasedSegmentsWhenTimestampIsNotProvidedForSomeMessages() {
> val retentionMs = 1000
> val old = TestUtils.singletonRecords("test".getBytes, timestamp = 0)
> val set = TestUtils.singletonRecords("test".getBytes, timestamp = -1, 
> magicValue = 0)
> val log = createLog(set.sizeInBytes, retentionMs = retentionMs)
> // append some messages to create some segments
> log.append(old)
> for (_ <- 0 until 12)
>   log.append(set)
> assertEquals("No segment should be deleted", 0, log.deleteOldSegments())
>   }
> {code}
> It can be prevented by using {{def largestTimestamp = 
> 

[jira] [Created] (KAFKA-5156) Options for handling exceptions in streams

2017-05-02 Thread Eno Thereska (JIRA)
Eno Thereska created KAFKA-5156:
---

 Summary: Options for handling exceptions in streams
 Key: KAFKA-5156
 URL: https://issues.apache.org/jira/browse/KAFKA-5156
 Project: Kafka
  Issue Type: Task
  Components: streams
Affects Versions: 0.10.2.0
Reporter: Eno Thereska
Assignee: Eno Thereska
 Fix For: 0.11.0.0


This is a task around options for handling exceptions in streams. It focuses 
around options for dealing with corrupt data (keep going, stop streams, log, 
retry, etc).




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5006) KeyValueStore.put may throw exception unrelated to the current put attempt

2017-05-02 Thread Eno Thereska (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eno Thereska updated KAFKA-5006:

Issue Type: Sub-task  (was: Bug)
Parent: KAFKA-5156

> KeyValueStore.put may throw exception unrelated to the current put attempt
> --
>
> Key: KAFKA-5006
> URL: https://issues.apache.org/jira/browse/KAFKA-5006
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 0.10.0.0, 0.10.1.0, 0.10.2.0
>Reporter: Xavier Léauté
>Assignee: Eno Thereska
>  Labels: user-experience
> Fix For: 0.11.0.0
>
>
> It is possible for {{KeyValueStore.put(K key, V value)}} to throw an 
> exception unrelated to the store in question. Due to [the way that 
> {{RecordCollector.send}} is currently 
> implemented|https://github.com/confluentinc/kafka/blob/3.2.x/streams/src/main/java/org/apache/kafka/streams/processor/internals/RecordCollectorImpl.java#L76]
> the exception thrown would be for any previous record produced by the stream 
> task, possibly for a completely unrelated topic the same task is producing to.
> This can be very confusing for someone attempting to correctly handle 
> exceptions thrown by put(), as they would not be able to add any additional 
> debugging information to understand the operation that caused the problem. 
> Worse, such logging would likely confuse the user, since they might mislead 
> themselves into thinking the changelog record created by calling put() caused 
> the problem.
> Given that there is likely no way for the user to recover from an exception 
> thrown by an unrelated produce request, it is questionable whether we should 
> even try to raise the exception at this level. A short-term fix would be to 
> simply delegate this exception to the uncaught exception handler.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5157) Options for handling corrupt data during deserialization

2017-05-02 Thread Eno Thereska (JIRA)
Eno Thereska created KAFKA-5157:
---

 Summary: Options for handling corrupt data during deserialization
 Key: KAFKA-5157
 URL: https://issues.apache.org/jira/browse/KAFKA-5157
 Project: Kafka
  Issue Type: Sub-task
  Components: streams
Reporter: Eno Thereska


When there is badly formatted data in the source topics, deserialization will 
throw a runtime exception all the way to the users. And since deserialization 
happens before the record is ever processed at the beginning of the topology, today 
there is no way to handle such errors at the user-app level.
We should consider allowing users to handle such "poison pills" in a 
customizable way.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4422) Drop support for Scala 2.10

2017-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15992994#comment-15992994
 ] 

ASF GitHub Bot commented on KAFKA-4422:
---

GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/2956

KAFKA-4422: Drop support for Scala 2.10



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka kafka-4422-drop-support-scala-2.10

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2956.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2956






> Drop support for Scala 2.10
> ---
>
> Key: KAFKA-4422
> URL: https://issues.apache.org/jira/browse/KAFKA-4422
> Project: Kafka
>  Issue Type: Task
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>  Labels: kip
> Fix For: 0.11.0.0
>
>
> Now that Scala 2.12 has been released, we should drop support for Scala 2.10  
> in the next major Kafka version so that we keep the number of supported 
> versions at 2. Since we have to compile and run the tests on each supported 
> version, there is a non-trivial cost from a development and testing 
> perspective.
> The clients library is in Java and we recommend people use the Java clients 
> instead of the Scala ones, so dropping support for Scala 2.10 should have a 
> smaller impact than it would have had in the past. Scala 2.10 was released in 
> January 2013 and support ended in March 2015. 
> Once we drop support for Scala 2.10, we can take advantage of APIs and 
> compiler improvements introduced in Scala 2.11 (introduced in April 2014): 
> http://scala-lang.org/news/2.11.0
> Link to KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-119%3A+Drop+Support+for+Scala+2.10+in+Kafka+0.11



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2956: KAFKA-4422: Drop support for Scala 2.10

2017-05-02 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/2956

KAFKA-4422: Drop support for Scala 2.10



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka kafka-4422-drop-support-scala-2.10

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2956.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2956






---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-4422) Drop support for Scala 2.10

2017-05-02 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-4422:
---
Status: Patch Available  (was: Open)

> Drop support for Scala 2.10
> ---
>
> Key: KAFKA-4422
> URL: https://issues.apache.org/jira/browse/KAFKA-4422
> Project: Kafka
>  Issue Type: Task
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>  Labels: kip
> Fix For: 0.11.0.0
>
>
> Now that Scala 2.12 has been released, we should drop support for Scala 2.10  
> in the next major Kafka version so that we keep the number of supported 
> versions at 2. Since we have to compile and run the tests on each supported 
> version, there is a non-trivial cost from a development and testing 
> perspective.
> The clients library is in Java and we recommend people use the Java clients 
> instead of the Scala ones, so dropping support for Scala 2.10 should have a 
> smaller impact than it would have had in the past. Scala 2.10 was released in 
> January 2013 and support ended in March 2015. 
> Once we drop support for Scala 2.10, we can take advantage of APIs and 
> compiler improvements introduced in Scala 2.11 (introduced in April 2014): 
> http://scala-lang.org/news/2.11.0
> Link to KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-119%3A+Drop+Support+for+Scala+2.10+in+Kafka+0.11



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4955) Add network handler thread utilization to request quota calculation

2017-05-02 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993165#comment-15993165
 ] 

Jun Rao commented on KAFKA-4955:


[~rsivaram], now that KAFKA-4954 is committed, should we close this jira too?

> Add network handler thread utilization to request quota calculation
> ---
>
> Key: KAFKA-4955
> URL: https://issues.apache.org/jira/browse/KAFKA-4955
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.11.0.0
>
>
> Add network thread utilization time to the metrics used for throttling based 
> on CPU utilization for requests. This time will be taken into account for the 
> throttling decision made when request handler thread time is recorded for the 
> subsequent request on the connection.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


RE: Kafka Connect and Partitions

2017-05-02 Thread david.franklin
Hi Gwen/Randall,

I think I've finally understood, more or less, how partitioning relates to 
SourceRecords.

Because I was using the SourceRecord constructor that doesn't provide values 
for key and key schema, the key is null.  The DefaultPartitioner appears to map 
null to a constant value rather than round-robin across all of the partitions :(
SourceRecord(Map<String, ?> sourcePartition, Map<String, ?> sourceOffset, 
String topic, Schema valueSchema, Object value)

Another SourceRecord constructor enables the partition to be specified but I'd 
prefer not to use this as I don't want to couple the non-Kafka source side to 
Kafka by making it aware of topic partitions - this would also presumably 
involve coupling it to the Cluster so that the number of partitions in a topic 
can be determined :(
SourceRecord(Map<String, ?> sourcePartition, Map<String, ?> sourceOffset, 
String topic, Integer partition, Schema keySchema, Object key, 
Schema valueSchema, Object value)

Instead, if I use the SourceRecord constructor that also takes arguments for 
the key and key schema (making them take the same values as the value and value 
schema in my application), then the custom partitioner / 
producer.partitioner.class property is not required and the data is distributed 
across the partitions :)
SourceRecord(Map<String, ?> sourcePartition, Map<String, ?> sourceOffset, 
String topic, Schema keySchema, Object key, 
Schema valueSchema, Object value)
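
For illustration, a hedged sketch of how this keyed variant might be used inside a
source task's poll() (the helper methods and the choice of using the line as both key
and value are assumptions carried over from the snippets quoted below, not a
recommendation):

events.add(new SourceRecord(
        offsetKey(filename),          // sourcePartition (connector-defined)
        offsetValue(++recordIndex),   // sourceOffset (connector-defined)
        topicName,
        Schema.BYTES_SCHEMA, line,    // key schema + key: the default partitioner hashes this
        Schema.BYTES_SCHEMA, line));  // value schema + value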

Many thanks once again for your guidance.  I think this puzzle is now solved :)
Best wishes,
David

-Original Message-
From: Randall Hauch [mailto:rha...@gmail.com] 
Sent: 28 April 2017 16:08
To: dev@kafka.apache.org
Subject: Re: Kafka Connect and Partitions

The source connector creates the SourceRecord object and can set a number of
fields, including the message's key and value, the Kafka topic name and, if
desired, the Kafka topic partition number. If the connector does set the
topic partition to a non-null value, then that's the partition to which
Kafka Connect will write the message; otherwise, the configured partitioner
(e.g., your custom partitioner) used by the Kafka Connect producer will
choose/compute the partition based purely upon the key and value byte
arrays. Note that if the connector doesn't set the topic partition number
and no special producer partitioner is specified, the default hash-based
Kafka partitioner will be used.

So, the connector can certainly set the topic partition number, and it may
be easier to do it there since the connector actually has the key and
values before they are serialized. But no matter what, the connector is the
only thing that sets the message key in the source record.

BTW, the SourceRecord's "source partition" and "source offset" are actually
the connector-defined information about the source and where the connector
has read in that source. Don't confuse these with the topic name or topic
partition number.

Hope that helps.

Randall

On Fri, Apr 28, 2017 at 7:15 AM,  wrote:

> Hi Gwen,
>
> Having added a custom partitioner (via the producer.partitioner.class
> property in worker.properties) that simply randomly selects a partition,
> the data is now written evenly across all the partitions :)
>
> The root of my confusion regarding why the default partitioner writes all
> data to the same partition is that I don't understand how the SourceRecords
> returned in the source task poll() method are used by the partitioner.  The
> data that is passed to the partitioner includes a key Object (which is an
> empty byte array - presumably this is a bad idea!), and a value Object
> (which is a non-empty byte array):
>
> @Override
> public int partition(String topic, Object key, byte[] keyBytes, Object
> value, byte[] valueBytes, Cluster cluster) {
> System.out.println(String.format(
> "### PARTITION key[%s][%s][%d] value[%s][%s][%d]",
> key, key.getClass().getSimpleName(), keyBytes.length,
> value, value.getClass().getSimpleName(),
> valueBytes.length));
>
> =>
> ### PARTITION key[[B@584f599f][byte[]][0] value[[B@73cc0cd8][byte[]][236]
>
> However, I don't understand how the above key and value are derived from
> the SourceRecord attributes which, in my application's case, is as follows:
>
> events.add(new SourceRecord(
> offsetKey(filename),
> offsetValue(++recordIndex),
> topicName,
> Schema.BYTES_SCHEMA,
> line));
> System.out.println(String.format(
> "### PARTITION SourceRecord key[%s] value[%s]
> topic[%s] schema[%s], data[%s]",
> offsetKey(filename), offsetValue(recordIndex),
> topicName, Schema.BYTES_SCHEMA, line));
>
> =>
> ### PARTITION SourceRecord 

Build failed in Jenkins: kafka-trunk-jdk8 #1470

2017-05-02 Thread Apache Jenkins Server
See 


Changes:

[ismael] KAFKA-4623; Default unclean.leader.election.enabled to false (KIP-106)

--
[...truncated 844.92 KB...]
kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithCollision PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokerListWithNoTopics STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokerListWithNoTopics PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testAutoCreateTopic STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testGetAllTopicMetadata 
STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testGetAllTopicMetadata 
PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testBasicTopicMetadata 
STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.PrimitiveApiTest > testMultiProduce STARTED

kafka.integration.PrimitiveApiTest > testMultiProduce PASSED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch STARTED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch PASSED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize 
STARTED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize PASSED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests STARTED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests PASSED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch STARTED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch PASSED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression STARTED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression PASSED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic STARTED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic PASSED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest STARTED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack PASSED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopicWithCollision 
STARTED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopicWithCollision 
PASSED

kafka.integration.PlaintextTopicMetadataTest > testAliveBrokerListWithNoTopics 
STARTED

kafka.integration.PlaintextTopicMetadataTest > testAliveBrokerListWithNoTopics 
PASSED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopic STARTED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.PlaintextTopicMetadataTest > testGetAllTopicMetadata STARTED

kafka.integration.PlaintextTopicMetadataTest > testGetAllTopicMetadata PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.PlaintextTopicMetadataTest > testBasicTopicMetadata STARTED

kafka.integration.PlaintextTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.MetricsDuringTopicCreationDeletionTest > 
testMetricsDuringTopicCreateDelete STARTED

kafka.integration.MetricsDuringTopicCreationDeletionTest > 
testMetricsDuringTopicCreateDelete PASSED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow PASSED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooLow 

[jira] [Updated] (KAFKA-5131) WriteTxnMarkers and complete commit/abort on partition immigration

2017-05-02 Thread Damian Guy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Guy updated KAFKA-5131:
--
Status: Patch Available  (was: Open)

> WriteTxnMarkers and complete commit/abort on partition immigration
> --
>
> Key: KAFKA-5131
> URL: https://issues.apache.org/jira/browse/KAFKA-5131
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Reporter: Damian Guy
>Assignee: Damian Guy
>
> When partitions immigrate we need to write the txn markers and complete the 
> commit/abort for any transactions in a PrepareXX state



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [VOTE] KIP-137: Enhance TopicCommand --describe to show topics marked for deletion

2017-05-02 Thread Vahid S Hashemian
+1 (non-binding)

Thanks Mickael.

--Vahid




From:   Ismael Juma 
To: dev@kafka.apache.org
Date:   05/02/2017 03:46 AM
Subject:Re: [VOTE] KIP-137: Enhance TopicCommand --describe to 
show topics marked for deletion
Sent by:isma...@gmail.com



Thanks for the KIP, +1 (binding).

Ismael

On Tue, May 2, 2017 at 11:43 AM, Mickael Maison 
wrote:

> Hi,
>
> I'd like to start the vote on KIP-137:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 
137%3A+Enhance+TopicCommand+--describe+to+show+topics+marked+for+deletion
>
> Thanks,
>






[jira] [Assigned] (KAFKA-4623) Change Default unclean.leader.election.enabled from True to False

2017-05-02 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma reassigned KAFKA-4623:
--

Assignee: Ismael Juma  (was: Sharad)

> Change Default unclean.leader.election.enabled from True to False
> -
>
> Key: KAFKA-4623
> URL: https://issues.apache.org/jira/browse/KAFKA-4623
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ben Stopford
>Assignee: Ismael Juma
> Fix For: 0.11.0.0
>
>
> See KIP-106
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-106+-+Change+Default+unclean.leader.election.enabled+from+True+to+False



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

