[jira] [Commented] (KAFKA-6933) Broker reports Corrupted index warnings apparently infinitely

2018-06-15 Thread Ismael Juma (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514680#comment-16514680
 ] 

Ismael Juma commented on KAFKA-6933:


[~bons], a clean shutdown should not require recovery on restart. A hard 
shutdown will cause recovery to be run for the segments after the recovery 
point (i.e. the recovery point is written periodically and its data is 
fsync'd). As part of the recovery of those segments, indices will be rebuilt. 
In addition, if indices are missing or corrupt due to a hard shutdown, they 
will be deleted and rebuilt.
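The recovery behaviour described above can be sketched roughly as follows. This is an illustrative model only, not Kafka's actual log code; the segment/index representation and the function name are assumptions.

```python
# Sketch (not Kafka's actual code): after a hard shutdown, segments at or
# beyond the recovery point are re-validated and their indices rebuilt;
# segments with missing/corrupt indices are rebuilt regardless.

def recover_log(segments, recovery_point):
    """Return the set of segment base offsets whose indices get rebuilt.

    segments: list of (base_offset, index_ok) pairs.
    """
    rebuilt = set()
    for base_offset, index_ok in segments:
        if base_offset >= recovery_point or not index_ok:
            # unflushed segment, or missing/corrupt index: delete + rebuild
            rebuilt.add(base_offset)
    return rebuilt

# clean shutdown: recovery point == log end offset, nothing to rebuild
assert recover_log([(0, True), (100, True)], recovery_point=200) == set()
# hard shutdown: segment 100 is past the recovery point and is rebuilt
assert recover_log([(0, True), (100, True)], recovery_point=100) == {100}
```

The point of the sketch: a clean shutdown advances the recovery point to the log end, so the first case rebuilds nothing, while a hard shutdown leaves segments past the recovery point to be recovered.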

> Broker reports Corrupted index warnings apparently infinitely
> -
>
> Key: KAFKA-6933
> URL: https://issues.apache.org/jira/browse/KAFKA-6933
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 1.0.1
>Reporter: Franco Bonazza
>Priority: Major
>
> I'm running into a situation where the server logs continuously show the 
> following snippet:
> {noformat}
> [2018-05-23 10:58:56,590] INFO Loading producer state from offset 20601420 
> for partition transaction_r10_updates-6 with message format version 2 
> (kafka.log.Log)
> [2018-05-23 10:58:56,592] INFO Loading producer state from snapshot file 
> '/data/0/kafka-logs/transaction_r10_updates-6/20601420.snapshot' 
> for partition transaction_r10_updates-6 (kafka.log.ProducerStateManager)
> [2018-05-23 10:58:56,593] INFO Completed load of log 
> transaction_r10_updates-6 with 74 log segments, log start offset 0 and log 
> end offset 20601420 in 5823 ms (kafka.log.Log)
> [2018-05-23 10:58:58,761] WARN Found a corrupted index file due to 
> requirement failed: Corrupt index found, index file 
> (/data/0/kafka-logs/transaction_r10_updates-15/20544956.index) 
> has non-zero size but the last offset is 20544956 which is no larger than the 
> base offset 20544956.}. deleting 
> /data/0/kafka-logs/transaction_r10_updates-15/20544956.timeindex, 
> /data/0/kafka-logs/transaction_r10_updates-15/20544956.index, and 
> /data/0/kafka-logs/transaction_r10_updates-15/20544956.txnindex 
> and rebuilding index... (kafka.log.Log)
> [2018-05-23 10:58:58,763] INFO Loading producer state from snapshot file 
> '/data/0/kafka-logs/transaction_r10_updates-15/20544956.snapshot' 
> for partition transaction_r10_updates-15 (kafka.log.ProducerStateManager)
> [2018-05-23 10:59:02,202] INFO Recovering unflushed segment 20544956 in log 
> transaction_r10_updates-15. (kafka.log.Log){noformat}
> The setup is the following:
> Broker is 1.0.1.
> There are mirrors from another cluster using client 0.10.2.1.
> There are Kafka Streams and other custom consumers/producers using the 
> 1.0.0 client.
>  
> While it is doing this, the JVM of the broker is up but it doesn't respond, 
> so it's impossible to produce, consume or run any commands.
> If I delete all the index files, the WARN turns into an ERROR. Recovery 
> takes a long time (1 day last time I tried) but eventually the broker goes 
> into a healthy state. Then I start the producers and things are still 
> healthy, but when I start the consumers it quickly goes back into the 
> original WARN loop, which seems infinite.
>  
> I couldn't find any references to the problem. It seems to be at least 
> misreporting the issue, and perhaps it's not infinite? I let it loop over 
> the WARN for over a day and it never moved past that, and if there were 
> something really wrong with the state maybe it should be reported.
> The log cleaner log showed a few "too many files open" errors when this 
> originally happened, but ulimit has always been set to unlimited, so I'm 
> not sure what that error means.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-5857) Excessive heap usage on controller node during reassignment

2018-06-15 Thread Ismael Juma (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514678#comment-16514678
 ] 

Ismael Juma commented on KAFKA-5857:


Has anyone tried this with recent versions? This was improved significantly in 
1.1.0.

> Excessive heap usage on controller node during reassignment
> ---
>
> Key: KAFKA-5857
> URL: https://issues.apache.org/jira/browse/KAFKA-5857
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 0.11.0.0
> Environment: CentOs 7, Java 1.8
>Reporter: Raoufeh Hashemian
>Priority: Major
>  Labels: reliability
> Fix For: 2.1.0
>
> Attachments: CPU.png, disk_write_x.png, memory.png, 
> reassignment_plan.txt
>
>
> I was trying to expand our kafka cluster from 6 broker nodes to 12 broker 
> nodes.
> Before expansion, we had a single topic with 960 partitions and a 
> replication factor of 3, so each node had 480 partitions. The size of data 
> on each node was 3TB.
> To do the expansion, I submitted a partition reassignment plan (see the 
> attached file for the current/new assignments). The plan was optimized to 
> minimize data movement and be rack aware.
> When I submitted the plan, it took approximately 3 hours for moving data 
> from old to new nodes to complete. After that, it started deleting source 
> partitions (I say this based on the number of file descriptors) and 
> rebalancing leaders, which was not successful. Meanwhile, heap usage on the 
> controller node started to climb steeply (along with long GC times); it 
> took 5 hours for the controller to run out of memory, and another 
> controller then showed the same behaviour for another 4 hours. At that 
> point zookeeper ran out of disk and the service stopped.
> To recover from this condition:
> 1) Removed zk logs to free up disk and restarted all 3 zk nodes
> 2) Deleted the /kafka/admin/reassign_partitions node from zk
> 3) Did unclean restarts of the kafka service on the OOM controller nodes, 
> which took 3 hours to complete. After this stage there were still 676 
> under-replicated partitions.
> 4) Did a clean restart on all 12 broker nodes.
> After step 4, the number of under-replicated partitions went to 0.
> So I was wondering if this memory footprint on the controller is expected 
> for ~1k partitions? Did we do something wrong, or is it a bug?
> Attached are some resource usage graphs from this 30-hour event, along with 
> the reassignment plan. I'll try to add log files as well.





[jira] [Updated] (KAFKA-7012) Performance issue upgrading to kafka 1.0.1 or 1.1

2018-06-15 Thread Rajini Sivaram (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-7012:
--
 Priority: Critical  (was: Major)
Fix Version/s: 2.0.0

> Performance issue upgrading to kafka 1.0.1 or 1.1
> -
>
> Key: KAFKA-7012
> URL: https://issues.apache.org/jira/browse/KAFKA-7012
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 1.1.0, 1.0.1
>Reporter: rajadayalan perumalsamy
>Assignee: praveen
>Priority: Critical
>  Labels: regression
> Fix For: 2.0.0
>
> Attachments: Commit-47ee8e954-0607-bufferkeys-nopoll-profile.png, 
> Commit-47ee8e954-0607-memory.png, Commit-47ee8e954-0607-profile.png, 
> Commit-47ee8e954-profile.png, Commit-47ee8e954-profile2.png, 
> Commit-f15cdbc91b-profile.png, Commit-f15cdbc91b-profile2.png
>
>
> We are trying to upgrade our Kafka cluster from 0.11.0.1 to 1.0.1. After 
> upgrading 1 node of the cluster, we noticed that network threads use most 
> of the CPU. It is a 3-node cluster with 15k messages/sec on each node. With 
> Kafka 0.11.0.1, typical usage of the servers is around 50 to 60% of a vcpu 
> (using less than 1 vcpu). After the upgrade we are seeing that CPU usage is 
> high depending on the number of network threads used. If network threads is 
> set to 8, CPU usage is around 850% (9 vcpus), and if it is set to 4, CPU 
> usage is around 450% (5 vcpus). We are using the same Kafka 
> server.properties for both.
> Further analysis with git bisect and a couple of builds and deploys traced 
> the issue to commit 47ee8e954df62b9a79099e944ec4be29afe046f6. CPU usage is 
> fine for commit f15cdbc91b240e656d9a2aeb6877e94624b21f8d, but with commit 
> 47ee8e954df62b9a79099e944ec4be29afe046f6 CPU usage has increased. We have 
> attached screenshots of profiling done with both commits. Screenshot 
> Commit-f15cdbc91b-profile shows less CPU usage by network threads, while 
> screenshots Commit-47ee8e954-profile and Commit-47ee8e954-profile2 show 
> higher CPU usage (almost the entire CPU usage) by network threads. We also 
> noticed that the kafka.network.Processor.poll() method is invoked 10 times 
> more often with commit 47ee8e954df62b9a79099e944ec4be29afe046f6.
> We need this issue to be resolved to upgrade the cluster. Please let me 
> know if you need any additional information.





[jira] [Commented] (KAFKA-7012) Performance issue upgrading to kafka 1.0.1 or 1.1

2018-06-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514484#comment-16514484
 ] 

ASF GitHub Bot commented on KAFKA-7012:
---

rajinisivaram opened a new pull request #5237: KAFKA-7012: Don't process SSL 
channels without data to process
URL: https://github.com/apache/kafka/pull/5237
 
 
   Avoid unnecessary processing of SSL channels when there are some bytes 
buffered, but not enough to make progress.
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
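The idea in the PR description can be sketched roughly as follows. This is an illustrative model, not the actual Selector code; the function name and the channel representation are assumptions.

```python
# Hedged sketch of the idea in PR #5237: only treat an SSL channel as ready
# when its buffered bytes can make progress; otherwise repeatedly polling
# it just burns CPU without producing any decrypted data.

def channels_to_process(channels):
    """channels: list of (name, buffered_bytes, min_bytes_for_progress)."""
    ready = []
    for name, buffered, needed in channels:
        if buffered >= needed:  # enough buffered data to make progress
            ready.append(name)
        # else: skip this channel and wait for more bytes from the socket
        # instead of spinning in poll() on a channel that cannot progress
    return ready

# channel "a" has enough buffered data; "b" does not and is skipped
assert channels_to_process([("a", 10, 4), ("b", 3, 4)]) == ["a"]
```

This matches the reported symptom: a channel with some buffered but insufficient bytes kept being selected, driving the extra poll() invocations and CPU seen in KAFKA-7012.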
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (KAFKA-7058) ConnectSchema#equals() broken for array-typed default values

2018-06-15 Thread Randall Hauch (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514460#comment-16514460
 ] 

Randall Hauch commented on KAFKA-7058:
--

I recommended that this be merged back as far as possible, including all of the 
branches in between (e.g., the 0.10 branches). That way it will be included in 
all subsequent patch releases, if/when they occur.

The next release (soon!) is going to be 2.0 since AK is removing support for 
JDK 7, and so even though we just released 1.1 we are not planning a 1.2 
release. However, we'll continue to cut patch releases for 1.1.x, 1.0.x, 
0.11.0.x, 0.10.1.x, 0.10.2.x, and even earlier branches as needed. (Does that 
answer your question?)

> ConnectSchema#equals() broken for array-typed default values
> 
>
> Key: KAFKA-7058
> URL: https://issues.apache.org/jira/browse/KAFKA-7058
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 1.0.0, 1.1.0
>Reporter: Gunnar Morling
>Priority: Major
>
> {{ConnectSchema#equals()}} calls {{Objects#equals()}} for the schemas' 
> default values, but this doesn't work correctly if the default values are 
> in fact arrays. In that case {{false}} is always returned, even if the 
> default value arrays are actually equal.
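The bug in miniature: Java arrays inherit reference-equality equals(), so Objects.equals() on two distinct but identical arrays is false. The Python model below stands in for that behavior (the real ConnectSchema code is Java; JavaArray is a made-up illustration).

```python
# Models a Java array: no value-based equality, only identity semantics,
# which is what Objects.equals() falls back to for array default values.

class JavaArray:
    def __init__(self, items):
        self.items = list(items)
    # no __eq__ defined: comparison is by identity, like Java arrays

a, b = JavaArray([1, 2, 3]), JavaArray([1, 2, 3])
assert a != b            # what ConnectSchema#equals effectively computes
assert a.items == b.items  # the fix: compare contents (cf. Arrays.deepEquals)
```

In Java, the corresponding fix is to special-case arrays, e.g. via Arrays.deepEquals, rather than relying on Objects.equals.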





[jira] [Comment Edited] (KAFKA-7058) ConnectSchema#equals() broken for array-typed default values

2018-06-15 Thread Randall Hauch (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512606#comment-16512606
 ] 

Randall Hauch edited comment on KAFKA-7058 at 6/15/18 10:35 PM:


[~gunnar.morling] thanks for the PR. It looks good, but we're a few days past 
code freeze and even though this is a small change I'm not sure we can call 
this a real blocker. I'm going to approve it, but we'll probably have to wait 
until after the 2.0 release to merge. The good news is that this should 
backport pretty cleanly back to the 0.9 or 0.10 branches, and so it would be in 
the next patch release on the various branches whenever they occur.


was (Author: rhauch):
[~gunnar.morling] thanks for the PR. It looks good, but we're a few days past 
code freeze and even though this is a small change I'm not sure we can call 
this a real blocker. I'm going to approve it, but we'll probably have to wait 
until after the 2.0 release to merge. The good news is that this should 
backport pretty cleanly back to the 0.10 branch, and so it would be in the next 
patch release on the various branches whenever they occur.



[jira] [Commented] (KAFKA-7040) The replica fetcher thread may truncate accepted messages during multiple fast leadership transitions

2018-06-15 Thread Anna Povzner (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514458#comment-16514458
 ] 

Anna Povzner commented on KAFKA-7040:
-

The offset should not be lost if min.insync.replicas > 1 and we have clean 
leader election:

Case 1: Message 100 gets replicated to broker1 (in step 3, as described). When 
broker0 truncates it in step 5 due to the old OffsetsForLeaderEpoch response, it 
will start fetching from offset 100, and fetch that same message from broker1.

Case 2: Message 100 does not get replicated to broker1. Suppose we fix the 
issue and discard the old OffsetsForLeaderEpoch response. Broker0 will send an 
OffsetsForLeaderEpoch request with leader epoch 11, and broker1 will respond 
with offset 100, leader epoch 10 (because it never appended any offsets from 
leader epoch 11). Broker0 will truncate the message with offset 100, leader 
epoch 11, to be consistent with the log on broker1. With the reported issue, 
broker0 will truncate based on the old OffsetsForLeaderEpoch response, and 
basically end up in the correct state.

The message could be lost if min.insync.replicas == 1 or in case of unclean 
leader election, but this is still within the contract. 

Just to clarify, I am not saying we should not fix it, just trying to 
understand the impact and priority. 
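Case 1 above can be checked with a toy model (variable names are illustrative; this is not broker code): the follower truncates offset 100 based on the stale response, then re-fetches it from the leader, so the logs converge without loss.

```python
# Toy model of Case 1: broker1 is the leader with offsets 0..100; broker0
# (the follower) also has 0..100 but truncates offset 100 due to a stale
# OffsetsForLeaderEpoch response, then re-fetches from its log end offset.

leader_log = list(range(0, 101))    # broker1: offsets 0..100
follower_log = list(range(0, 101))  # broker0: accepted 100 while leader

# stale response causes truncation to offset 100
follower_log = [o for o in follower_log if o < 100]
assert 100 not in follower_log

# follower resumes fetching from its log end offset (== 100)
fetch_from = len(follower_log)
follower_log.extend(leader_log[fetch_from:])
assert follower_log == leader_log   # converged; offset 100 recovered
```

The unnecessary truncate-and-refetch is the efficiency cost discussed in the next comment; correctness holds as long as the message survives on the new leader.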

 

> The replica fetcher thread may truncate accepted messages during multiple 
> fast leadership transitions
> -
>
> Key: KAFKA-7040
> URL: https://issues.apache.org/jira/browse/KAFKA-7040
> Project: Kafka
>  Issue Type: Bug
>Reporter: Lucas Wang
>Priority: Minor
>
> Problem Statement:
> Consider the scenario where there are two brokers, broker0, and broker1, and 
> there are two partitions "t1p0", and "t1p1"[1], both of which have broker1 as 
> the leader and broker0 as the follower. The following sequence of events 
> happened on broker0
> 1. The replica fetcher thread on a broker0 issues a LeaderEpoch request to 
> broker1, and awaits to get the response
> 2. A LeaderAndISR request causes broker0 to become the leader for one 
> partition t1p0, which in turn will remove the partition t1p0 from the replica 
> fetcher thread
> 3. Broker0 accepts some messages from a producer
> 4. A 2nd LeaderAndISR request causes broker1 to become the leader, and 
> broker0 to become the follower for partition t1p0. This will cause the 
> partition t1p0 to be added back to the replica fetcher thread on broker0.
> 5. The replica fetcher thread on broker0 receives a response for the 
> LeaderEpoch request issued in step 1, and truncates the accepted messages in 
> step3.
> The issue can be reproduced with the test from 
> https://github.com/gitlw/kafka/commit/8956e743f0e432cc05648da08c81fc1167b31bea
> [1] Initially we set up broker0 to be the follower of two partitions instead 
> of just one, to avoid the shutting down of the replica fetcher thread when it 
> becomes idle.





[jira] [Commented] (KAFKA-7040) The replica fetcher thread may truncate accepted messages during multiple fast leadership transitions

2018-06-15 Thread Anna Povzner (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514401#comment-16514401
 ] 

Anna Povzner commented on KAFKA-7040:
-

Hi [~luwang],

Did you actually see data loss (or log divergence)? If so, what was your 
producer's acks config and min.insync.replicas?

If I am understanding this correctly, this is not a correctness issue, but 
rather an efficiency issue. In other words, broker0 may truncate based on an 
OffsetsForLeaderEpoch response that is no longer valid, but it will re-fetch 
the truncated messages from the leader (and since this is a fast leader change, 
there should be very few messages to re-fetch). If the message that got 
truncated is not on the new leader, it would be truncated anyway. I see that 
in the example in your last comment, offset 100 got replicated to broker1, so 
this is the case where broker0 may truncate that offset and re-fetch it again.

So, at this moment, I don't see a possibility of losing data (with the 
appropriate configs/settings). However, I agree that it would be useful to 
fence the replica fetcher from processing an OffsetsForLeaderEpoch response 
that arrived before the partition was removed and then re-added to the fetcher.

 



[jira] [Commented] (KAFKA-7056) Connect's new numeric converters should be in a different package

2018-06-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514397#comment-16514397
 ] 

ASF GitHub Bot commented on KAFKA-7056:
---

ewencp closed pull request #5222: KAFKA-7056: Moved Connect’s new numeric 
converters to runtime (KIP-305)
URL: https://github.com/apache/kafka/pull/5222
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/connect/api/src/main/java/org/apache/kafka/connect/storage/DoubleConverter.java
 
b/connect/runtime/src/main/java/org/apache/kafka/connect/converters/DoubleConverter.java
similarity index 91%
rename from 
connect/api/src/main/java/org/apache/kafka/connect/storage/DoubleConverter.java
rename to 
connect/runtime/src/main/java/org/apache/kafka/connect/converters/DoubleConverter.java
index 04019a7a529..684caa1c658 100644
--- 
a/connect/api/src/main/java/org/apache/kafka/connect/storage/DoubleConverter.java
+++ 
b/connect/runtime/src/main/java/org/apache/kafka/connect/converters/DoubleConverter.java
@@ -14,11 +14,13 @@
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
-package org.apache.kafka.connect.storage;
+package org.apache.kafka.connect.converters;
 
 import org.apache.kafka.common.serialization.DoubleDeserializer;
 import org.apache.kafka.common.serialization.DoubleSerializer;
 import org.apache.kafka.connect.data.Schema;
+import org.apache.kafka.connect.storage.Converter;
+import org.apache.kafka.connect.storage.HeaderConverter;
 
 /**
  * {@link Converter} and {@link HeaderConverter} implementation that only 
supports serializing to and deserializing from double values.
diff --git 
a/connect/api/src/main/java/org/apache/kafka/connect/storage/FloatConverter.java
 
b/connect/runtime/src/main/java/org/apache/kafka/connect/converters/FloatConverter.java
similarity index 91%
rename from 
connect/api/src/main/java/org/apache/kafka/connect/storage/FloatConverter.java
rename to 
connect/runtime/src/main/java/org/apache/kafka/connect/converters/FloatConverter.java
index 16bf0e0f93f..3f92b965cec 100644
--- 
a/connect/api/src/main/java/org/apache/kafka/connect/storage/FloatConverter.java
+++ 
b/connect/runtime/src/main/java/org/apache/kafka/connect/converters/FloatConverter.java
@@ -14,11 +14,13 @@
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
-package org.apache.kafka.connect.storage;
+package org.apache.kafka.connect.converters;
 
 import org.apache.kafka.common.serialization.FloatDeserializer;
 import org.apache.kafka.common.serialization.FloatSerializer;
 import org.apache.kafka.connect.data.Schema;
+import org.apache.kafka.connect.storage.Converter;
+import org.apache.kafka.connect.storage.HeaderConverter;
 
 /**
  * {@link Converter} and {@link HeaderConverter} implementation that only 
supports serializing to and deserializing from float values.
diff --git 
a/connect/api/src/main/java/org/apache/kafka/connect/storage/IntegerConverter.java
 
b/connect/runtime/src/main/java/org/apache/kafka/connect/converters/IntegerConverter.java
similarity index 91%
rename from 
connect/api/src/main/java/org/apache/kafka/connect/storage/IntegerConverter.java
rename to 
connect/runtime/src/main/java/org/apache/kafka/connect/converters/IntegerConverter.java
index 6f3c78a0a73..f5388ce35bb 100644
--- 
a/connect/api/src/main/java/org/apache/kafka/connect/storage/IntegerConverter.java
+++ 
b/connect/runtime/src/main/java/org/apache/kafka/connect/converters/IntegerConverter.java
@@ -14,11 +14,13 @@
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
-package org.apache.kafka.connect.storage;
+package org.apache.kafka.connect.converters;
 
 import org.apache.kafka.common.serialization.IntegerDeserializer;
 import org.apache.kafka.common.serialization.IntegerSerializer;
 import org.apache.kafka.connect.data.Schema;
+import org.apache.kafka.connect.storage.Converter;
+import org.apache.kafka.connect.storage.HeaderConverter;
 
 /**
  * {@link Converter} and {@link HeaderConverter} implementation that only 
supports serializing to and deserializing from integer values.
diff --git 
a/connect/api/src/main/java/org/apache/kafka/connect/storage/LongConverter.java 
b/connect/runtime/src/main/java/org/apache/kafka/connect/converters/LongConverter.java
similarity index 91%
rename from 
connect/api/src/main/java/org/apache/kafka/connect/storage/LongConverter.java
rename to 
connect/runtime/src/main/java/org/apache/kafka/connect/converters/LongConverter.java
index 600c3042502..f91f4fad963 100644
--- 
a/connect/api/src/main/java/org/apache/kafka/connect/storage/LongConverter.java
+++ 

[jira] [Resolved] (KAFKA-7056) Connect's new numeric converters should be in a different package

2018-06-15 Thread Ewen Cheslack-Postava (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-7056.
--
   Resolution: Fixed
Fix Version/s: 2.1.0

Issue resolved by pull request 5222
[https://github.com/apache/kafka/pull/5222]

> Connect's new numeric converters should be in a different package
> -
>
> Key: KAFKA-7056
> URL: https://issues.apache.org/jira/browse/KAFKA-7056
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.0.0
>Reporter: Randall Hauch
>Assignee: Randall Hauch
>Priority: Critical
> Fix For: 2.0.0, 2.1.0
>
>
> KIP-305 added several new primitive converters, but placed them alongside 
> {{StringConverter}} in the {{...connect.storage}} package rather than 
> alongside {{ByteArrayConverter}} in the {{...connect.converters}} package. We 
> should move them to the {{converters}} package. See 
> https://github.com/apache/kafka/pull/5198 for a discussion.
> Need to also update the plugins whitelist (see KAFKA-7043).





[jira] [Updated] (KAFKA-7064) "Unexpected resource type GROUP" when describing broker configs using latest admin client

2018-06-15 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-7064:
---
Fix Version/s: 2.0.0

> "Unexpected resource type GROUP" when describing broker configs using latest 
> admin client
> -
>
> Key: KAFKA-7064
> URL: https://issues.apache.org/jira/browse/KAFKA-7064
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Rohan Desai
>Assignee: Andy Coates
>Priority: Blocker
> Fix For: 2.0.0
>
>
> I'm getting the following error when I try to describe broker configs using 
> the admin client:
> {code:java}
> org.apache.kafka.common.errors.InvalidRequestException: Unexpected resource 
> type GROUP for resource 0{code}
> I think it's due to this commit: 
> [https://github.com/apache/kafka/commit/49db5a63c043b50c10c2dfd0648f8d74ee917b6a]
>  
> My guess at what's going on is that, now that the client is using 
> ConfigResource instead of Resource, it's sending a describe request for 
> resource type BROKER w/ id 3, while the broker associates id 3 w/ GROUP
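The suspected mismatch can be illustrated with a toy model. The ids below are made up for illustration (the report does not give the actual enum values): if client and broker map resource-type ids differently, a BROKER request is decoded as something else on the other side.

```python
# Hypothetical illustration (ids invented): the new client and the broker
# disagree on the numeric id for a resource type, so a BROKER describe
# request is decoded as GROUP, producing the InvalidRequestException above.

CLIENT_TYPE_IDS = {"BROKER": 3}              # id the new client sends
BROKER_ID_TO_TYPE = {3: "GROUP", 4: "BROKER"}  # how the broker decodes ids

wire_id = CLIENT_TYPE_IDS["BROKER"]
decoded = BROKER_ID_TO_TYPE[wire_id]
assert decoded == "GROUP"  # broker sees GROUP where client meant BROKER
```

Any id-scheme change between ConfigResource and the old Resource class would produce exactly this kind of cross-version decoding error.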





[jira] [Assigned] (KAFKA-7064) "Unexpected resource type GROUP" when describing broker configs using latest admin client

2018-06-15 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma reassigned KAFKA-7064:
--

Assignee: Andy Coates



[jira] [Commented] (KAFKA-7064) "Unexpected resource type GROUP" when describing broker configs using latest admin client

2018-06-15 Thread Ismael Juma (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514379#comment-16514379
 ] 

Ismael Juma commented on KAFKA-7064:


This is a regression and it needs to be fixed before 2.0.0, cc [~rsivaram].

> "Unexpected resource type GROUP" when describing broker configs using latest 
> admin client
> -
>
> Key: KAFKA-7064
> URL: https://issues.apache.org/jira/browse/KAFKA-7064
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Rohan Desai
>Assignee: Andy Coates
>Priority: Blocker
> Fix For: 2.0.0
>
>
> I'm getting the following error when I try to describe broker configs using 
> the admin client:
> {code:java}
> org.apache.kafka.common.errors.InvalidRequestException: Unexpected resource 
> type GROUP for resource 0{code}
> I think its due to this commit: 
> [https://github.com/apache/kafka/commit/49db5a63c043b50c10c2dfd0648f8d74ee917b6a]
>  
> My guess at what's going on is that now that the client is using 
> ConfigResource instead of Resource it's sending a describe request for 
> resource type BROKER w/ id 3, while the broker associates id 3 w/ GROUP





[jira] [Updated] (KAFKA-7064) "Unexpected resource type GROUP" when describing broker configs using latest admin client

2018-06-15 Thread Rohan Desai (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohan Desai updated KAFKA-7064:
---
Description: 
I'm getting the following error when I try to describe broker configs using the 
admin client:
{code:java}
org.apache.kafka.common.errors.InvalidRequestException: Unexpected resource 
type GROUP for resource 0{code}
I think it's due to this commit: 
[https://github.com/apache/kafka/commit/49db5a63c043b50c10c2dfd0648f8d74ee917b6a]
 

My guess at what's going on is that now that the client uses ConfigResource 
instead of Resource, it sends a describe request for resource type BROKER with 
id 3, while the broker associates id 3 with GROUP

  was:
I'm getting the following error when I try to describe broker configs using the 
admin client:

 

```org.apache.kafka.common.errors.InvalidRequestException: Unexpected resource 
type GROUP for resource 0```

 

I think its due to this commit: 
https://github.com/apache/kafka/commit/49db5a63c043b50c10c2dfd0648f8d74ee917b6a

 

My guess at what's going on is that now that the client is using ConfigResource 
instead of Resource it's sending a describe request for resource type BROKER w/ 
id 3, while the broker associates id 3 w/ GROUP


> "Unexpected resource type GROUP" when describing broker configs using latest 
> admin client
> -
>
> Key: KAFKA-7064
> URL: https://issues.apache.org/jira/browse/KAFKA-7064
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Rohan Desai
>Priority: Blocker
>
> I'm getting the following error when I try to describe broker configs using 
> the admin client:
> {code:java}
> org.apache.kafka.common.errors.InvalidRequestException: Unexpected resource 
> type GROUP for resource 0{code}
> I think it's due to this commit: 
> [https://github.com/apache/kafka/commit/49db5a63c043b50c10c2dfd0648f8d74ee917b6a]
>  
> My guess at what's going on is that now that the client uses 
> ConfigResource instead of Resource, it sends a describe request for 
> resource type BROKER with id 3, while the broker associates id 3 with GROUP





[jira] [Created] (KAFKA-7064) "Unexpected resource type GROUP" when describing broker configs using latest admin client

2018-06-15 Thread Rohan Desai (JIRA)
Rohan Desai created KAFKA-7064:
--

 Summary: "Unexpected resource type GROUP" when describing broker 
configs using latest admin client
 Key: KAFKA-7064
 URL: https://issues.apache.org/jira/browse/KAFKA-7064
 Project: Kafka
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Rohan Desai


I'm getting the following error when I try to describe broker configs using the 
admin client:

 

```org.apache.kafka.common.errors.InvalidRequestException: Unexpected resource 
type GROUP for resource 0```

 

I think it's due to this commit: 
https://github.com/apache/kafka/commit/49db5a63c043b50c10c2dfd0648f8d74ee917b6a

 

My guess at what's going on is that now that the client uses ConfigResource 
instead of Resource, it sends a describe request for resource type BROKER with 
id 3, while the broker associates id 3 with GROUP
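If that guess is right, the failure mode can be illustrated with a self-contained sketch. The numeric ids below are hypothetical, chosen only to show how a client/broker disagreement on resource-type ids would surface; they are not the actual Kafka protocol values:

```java
import java.util.Map;

public class ResourceTypeMismatch {
    // Hypothetical broker-side decoding table for wire resource-type ids.
    static final Map<Integer, String> BROKER_SIDE =
            Map.of(0, "UNKNOWN", 1, "ANY", 2, "TOPIC", 3, "GROUP", 4, "CLUSTER");

    // Hypothetical client-side encoding table that assigns BROKER the id
    // the broker uses for GROUP.
    static final Map<String, Integer> CLIENT_SIDE =
            Map.of("UNKNOWN", 0, "TOPIC", 2, "BROKER", 3);

    // How the broker would interpret a type the client put on the wire.
    public static String brokerView(String clientType) {
        return BROKER_SIDE.get(CLIENT_SIDE.get(clientType));
    }

    public static void main(String[] args) {
        // The client means BROKER, but the broker decodes the id as GROUP,
        // matching the "Unexpected resource type GROUP" error above.
        System.out.println("client sends BROKER -> broker sees " + brokerView("BROKER"));
    }
}
```

The mismatch only bites for types whose ids diverge between the two tables, which is why other describe calls can keep working while broker describes fail.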





[jira] [Comment Edited] (KAFKA-7012) Performance issue upgrading to kafka 1.0.1 or 1.1

2018-06-15 Thread radai rosenblatt (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513037#comment-16513037
 ] 

radai rosenblatt edited comment on KAFKA-7012 at 6/15/18 8:48 PM:
--

I don't have the time to pick this up right now.

IIRC the original PR (https://github.com/apache/kafka/pull/2330) had a more 
complicated condition for when a key (channel) gets picked into 
`keysWithBufferedRead`. The condition is currently
{code}
if (channel.hasBytesBuffered()) {
    // this channel has bytes enqueued in intermediary buffers that we could not read
    // (possibly because no memory). it may be the case that the underlying socket will
    // not come up in the next poll() and so we need to remember this channel for the
    // next poll call otherwise data may be stuck in said buffers forever.
    keysWithBufferedRead.add(key);
}
{code}
This results in lots of "false positives": keys that have something left over 
in SSL buffers (likely, since request sizes are rarely a multiple of SSL cipher 
block sizes) and that make the next poll() cycle inefficient.

The condition needs to check whether a channel has something left *that could 
not be read out due to memory pressure*.

Alternatively, the doomsday scenario this is meant to handle is pretty rare: 
if a channel has a request fully inside the SSL buffers that cannot be read due 
to memory pressure, *and* the underlying channel will never have any more 
incoming bytes (so it will never come back from select), the request will sit 
there and rot, resulting in a client timeout.

The alternative to making the condition more complicated is not handling this 
case at all and accepting the (rare?) timeout.


was (Author: radai):
i dont have the time to pick this up right now.

IIRC the original PR (https://github.com/apache/kafka/pull/2330) had a more 
complicated condition for when a key (channel) gets picked into 
`keysWithBufferedRead`. the codition is currently
{code}
if (channel.hasBytesBuffered()) {
//this channel has bytes enqueued in intermediary buffers 
that we could not read
//(possibly because no memory). it may be the case that the 
underlying socket will
//not come up in the next poll() and so we need to remember 
this channel for the
//next poll call otherwise data may be stuck in said 
buffers forever.
keysWithBufferedRead.add(key);
}
{code}
which results in lots of "false positives" - keys that have something left over 
in ssl buffers (likely, since request sizes are rarely a multiple of ssl cipher 
block sizes) that cause the next poll() cycle to be inefficient.

the conditions needs to check if a channel has something left *that could not 
be read out due to memory pressure*.

alternatively - the doomsday scenario this is meant to handle is pretty rare: 
if a channel has a request fully inside the ssl buffers that cannot be read due 
to memory pressure, *and* the underlying channel will never have any more 
incoming bytes (so will never come back from select) the request will sit there 
anr rot resulting in a client timeout.

the alternative to making the condition more complicated is not treating this 
case at all and suffering the (rare?) timeout?

> Performance issue upgrading to kafka 1.0.1 or 1.1
> -
>
> Key: KAFKA-7012
> URL: https://issues.apache.org/jira/browse/KAFKA-7012
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 1.1.0, 1.0.1
>Reporter: rajadayalan perumalsamy
>Assignee: praveen
>Priority: Major
>  Labels: regression
> Attachments: Commit-47ee8e954-0607-bufferkeys-nopoll-profile.png, 
> Commit-47ee8e954-0607-memory.png, Commit-47ee8e954-0607-profile.png, 
> Commit-47ee8e954-profile.png, Commit-47ee8e954-profile2.png, 
> Commit-f15cdbc91b-profile.png, Commit-f15cdbc91b-profile2.png
>
>
> We are trying to upgrade kafka cluster from Kafka 0.11.0.1 to Kafka 1.0.1. 
> After upgrading 1 node on the cluster, we notice that network threads use 
> most of the cpu. It is a 3 node cluster with 15k messages/sec on each node. 
> With Kafka 0.11.0.1 typical usage of the servers is around 50 to 60% 
> vcpu(using less than 1 vcpu). After upgrade we are noticing that cpu usage is 
> high depending on the number of network threads used. If network threads is 
> set to 8, then the cpu usage is around 850%(9 vcpus) and if it is set to 4 
> then the cpu usage is around 450%(5 vcpus). Using the same kafka 
> server.properties for both.
> Did further analysis with git bisect, a couple of builds and deploys, and 
> traced the issue to commit 

[jira] [Commented] (KAFKA-6975) AdminClient.deleteRecords() may cause replicas unable to fetch from beginning

2018-06-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514315#comment-16514315
 ] 

ASF GitHub Bot commented on KAFKA-6975:
---

hachikuji closed pull request #5235: KAFKA-6975; Fix replica fetching from 
non-batch-aligned log start offset
URL: https://github.com/apache/kafka/pull/5235
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git a/core/src/main/scala/kafka/cluster/Partition.scala 
b/core/src/main/scala/kafka/cluster/Partition.scala
index e038b5885b7..55edd10b5a3 100755
--- a/core/src/main/scala/kafka/cluster/Partition.scala
+++ b/core/src/main/scala/kafka/cluster/Partition.scala
@@ -22,6 +22,7 @@ import java.util.concurrent.locks.ReentrantReadWriteLock
 import com.yammer.metrics.core.Gauge
 import kafka.admin.AdminUtils
 import kafka.api.LeaderAndIsr
+import kafka.common.UnexpectedAppendOffsetException
 import kafka.controller.KafkaController
 import kafka.log.{LogAppendInfo, LogConfig}
 import kafka.metrics.KafkaMetricsGroup
@@ -29,7 +30,7 @@ import kafka.server._
 import kafka.utils.CoreUtils.{inReadLock, inWriteLock}
 import kafka.utils._
 import org.apache.kafka.common.TopicPartition
-import org.apache.kafka.common.errors.{NotEnoughReplicasException, 
NotLeaderForPartitionException, PolicyViolationException}
+import org.apache.kafka.common.errors.{ReplicaNotAvailableException, 
NotEnoughReplicasException, NotLeaderForPartitionException, 
PolicyViolationException}
 import org.apache.kafka.common.protocol.Errors
 import org.apache.kafka.common.protocol.Errors._
 import org.apache.kafka.common.record.MemoryRecords
@@ -155,6 +156,10 @@ class Partition(val topic: String,
 
   def getReplica(replicaId: Int = localBrokerId): Option[Replica] = 
Option(assignedReplicaMap.get(replicaId))
 
+  def getReplicaOrException(replicaId: Int = localBrokerId): Replica =
+getReplica(replicaId).getOrElse(
+  throw new ReplicaNotAvailableException(s"Replica $replicaId is not 
available for partition $topicPartition"))
+
   def leaderReplicaIfLocal: Option[Replica] =
 leaderReplicaIdOpt.filter(_ == localBrokerId).flatMap(getReplica)
 
@@ -486,6 +491,31 @@ class Partition(val topic: String,
 laggingReplicas
   }
 
+  def appendRecordsToFollower(records: MemoryRecords) {
+try {
+  getReplicaOrException().log.get.appendAsFollower(records)
+} catch {
+  case e: UnexpectedAppendOffsetException =>
+val replica = getReplicaOrException()
+val logEndOffset = replica.logEndOffset.messageOffset
+if (logEndOffset == replica.logStartOffset &&
+e.firstOffset < logEndOffset && e.lastOffset >= logEndOffset) {
+  // This may happen if the log start offset on the leader (or current 
replica) falls in
+  // the middle of the batch due to delete records request and the 
follower tries to
+  // fetch its first offset from the leader.
+  // We handle this case here instead of Log#append() because we will 
need to remove the
+  // segment that start with log start offset and create a new one 
with earlier offset
+  // (base offset of the batch), which will move recoveryPoint 
backwards, so we will need
+  // to checkpoint the new recovery point before we append
+  info(s"Unexpected offset in append to $topicPartition. First offset 
${e.firstOffset} is less than log start offset ${replica.logStartOffset}." +
+   s" Since this is the first record to be appended to the 
follower's log, will start the log from offset ${e.firstOffset}.")
+  logManager.truncateFullyAndStartAt(topicPartition, e.firstOffset)
+  replica.log.get.appendAsFollower(records)
+} else
+  throw e
+}
+  }
+
   def appendRecordsToLeader(records: MemoryRecords, isFromClient: Boolean, 
requiredAcks: Int = 0): LogAppendInfo = {
 val (info, leaderHWIncremented) = inReadLock(leaderIsrUpdateLock) {
   leaderReplicaIfLocal match {
diff --git a/core/src/main/scala/kafka/common/OffsetsOutOfOrderException.scala 
b/core/src/main/scala/kafka/common/OffsetsOutOfOrderException.scala
new file mode 100644
index 000..f8daaa4a181
--- /dev/null
+++ b/core/src/main/scala/kafka/common/OffsetsOutOfOrderException.scala
@@ -0,0 +1,25 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ * 
+ * 

[jira] [Resolved] (KAFKA-5590) Delete Kafka Topic Complete Failed After Enable Ranger Kafka Plugin

2018-06-15 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-5590.
--
Resolution: Information Provided

Closing inactive issue. Please reopen if you think the issue still exists.
 

> Delete Kafka Topic Complete Failed After Enable Ranger Kafka Plugin
> ---
>
> Key: KAFKA-5590
> URL: https://issues.apache.org/jira/browse/KAFKA-5590
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.10.0.0
> Environment: kafka and ranger under ambari
>Reporter: Chaofeng Zhao
>Priority: Major
>
> Hi:
> Recently I have been developing applications that use Kafka under Ranger. But 
> when I enable the Ranger Kafka plugin I cannot delete a Kafka topic 
> completely, even though 'delete.topic.enable=true' is set. I find that with 
> the Ranger Kafka plugin enabled the operation must be authorized. How can I 
> delete a Kafka topic completely under Ranger? Thank you.





[jira] [Resolved] (KAFKA-5691) ZKUtils.CreateRecursive should handle NOAUTH

2018-06-15 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-5691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-5691.
--
Resolution: Auto Closed

Scala consumers and related tools/tests are removed in KAFKA-2983. This change 
may not be required now.  Please reopen if you think otherwise

 

> ZKUtils.CreateRecursive should handle NOAUTH
> 
>
> Key: KAFKA-5691
> URL: https://issues.apache.org/jira/browse/KAFKA-5691
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ryan P
>Assignee: Ryan P
>Priority: Major
>
> Old consumers are unable to register themselves with secured ZK installations 
> because a NOAUTH code is returned when attempting to create `/consumers`. 
> Rather than failing, Kafka should log the error and continue down the path 
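The requested behavior can be sketched with stand-in types; the KeeperCode enum and ZkClient interface below are simplified stand-ins, not Kafka's or ZooKeeper's actual API:

```java
public class CreateRecursiveSketch {
    public enum KeeperCode { OK, NODEEXISTS, NOAUTH, CONNECTIONLOSS }

    public interface ZkClient {
        KeeperCode create(String path);
    }

    // Create every node along the path. NODEEXISTS is already tolerated; the
    // ticket asks for NOAUTH to be logged and tolerated as well, instead of
    // aborting the old consumer's registration.
    public static void createRecursive(ZkClient zk, String path) {
        StringBuilder prefix = new StringBuilder();
        for (String part : path.substring(1).split("/")) {
            prefix.append('/').append(part);
            KeeperCode code = zk.create(prefix.toString());
            switch (code) {
                case OK:
                case NODEEXISTS:
                    break; // fine either way
                case NOAUTH:
                    // Requested change: log and keep going rather than fail.
                    System.err.println("No auth to create " + prefix + ", continuing");
                    break;
                default:
                    throw new IllegalStateException("create failed at " + prefix + ": " + code);
            }
        }
    }
}
```

With this shape, a NOAUTH on an intermediate node like `/consumers` no longer prevents the deeper group/id nodes from being attempted.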





[jira] [Resolved] (KAFKA-5408) Using the ConsumerRecord instead of BaseConsumerRecord in the ConsoleConsumer

2018-06-15 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-5408.
--
Resolution: Fixed

This is taken care of in KAFKA-2983

> Using the ConsumerRecord instead of BaseConsumerRecord in the ConsoleConsumer
> -
>
> Key: KAFKA-5408
> URL: https://issues.apache.org/jira/browse/KAFKA-5408
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Reporter: Paolo Patierno
>Priority: Minor
>
> Hi,
> because the BaseConsumerRecord is marked as deprecated and will be removed in 
> future versions, it could be worth starting to remove its usage in the 
> ConsoleConsumer. 
> If it makes sense to you, I'd like to work on that, starting to contribute to 
> the project.
> Thanks,
> Paolo.





[jira] [Resolved] (KAFKA-4723) offsets.storage=kafka - groups stuck in rebalancing with committed offsets

2018-06-15 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-4723.
--
Resolution: Auto Closed

Closing inactive issue. The old consumer is no longer supported. Please upgrade 
to the Java consumer whenever possible.

> offsets.storage=kafka - groups stuck in rebalancing with committed offsets
> --
>
> Key: KAFKA-4723
> URL: https://issues.apache.org/jira/browse/KAFKA-4723
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.1
>Reporter: biker73
>Priority: Minor
>
> Hi, I have moved the offset store to Kafka only. When I now run
>  bin/kafka-consumer-groups.sh  --bootstrap-server localhost:9094 --describe  
> --new-consumer --group my_consumer_group
> I get the message:
> Consumer group `my_consumer_group` does not exist or is rebalancing.
> I have found the issue KAFKA-3144, however this refers to consumer groups 
> that have no committed offsets; the groups I am looking at do, and they are 
> constantly in use.
> Using --list I get all my consumer groups returned. Although some are 
> inactive, I have around 6 very active ones (millions of messages a day, 
> constantly). Looking at the MBean data, Kafka Tool, etc. I can see the lags 
> and offsets changing every second. Therefore I would expect the 
> kafka-consumer-groups.sh script to return the lags and offsets for all 6 
> active consumer groups.
> I think what has happened is that when I moved offset storage to Kafka from 
> ZooKeeper (and then disabled sending to both), something got confused. 
> Querying ZooKeeper I get the offsets for the allegedly missing consumer 
> groups - but they should be stored and committed to Kafka.
> Many thanks.





[jira] [Resolved] (KAFKA-4295) kafka-console-consumer.sh does not delete the temporary group in zookeeper

2018-06-15 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-4295.
--
Resolution: Auto Closed

Closing inactive issue. The old consumer is no longer supported.

> kafka-console-consumer.sh does not delete the temporary group in zookeeper
> --
>
> Key: KAFKA-4295
> URL: https://issues.apache.org/jira/browse/KAFKA-4295
> Project: Kafka
>  Issue Type: Bug
>  Components: admin
>Affects Versions: 0.9.0.1, 0.10.0.0, 0.10.0.1
>Reporter: Sswater Shi
>Assignee: huxihx
>Priority: Minor
>
> I'm not sure whether it is a bug or by design.
> Since 0.9.x.x, kafka-console-consumer.sh does not delete the group 
> information in zookeeper/consumers on exit when run without "--new-consumer". 
> There will be a lot of abandoned zookeeper/consumers/console-consumer-xxx 
> entries if kafka-console-consumer.sh is run a lot of times.
> In 0.8.x.x, kafka-console-consumer.sh could be followed by an argument 
> "group". If not specified, kafka-console-consumer.sh would create a temporary 
> group name like 'console-consumer-'. If the group name was specified by 
> "group", the information in zookeeper/consumers was kept on exit. If the 
> group name was a temporary one, the information in zookeeper was deleted 
> when kafka-console-consumer.sh was quit with Ctrl+C. Why was this changed in 
> 0.9.x.x?





[jira] [Resolved] (KAFKA-4196) Transient test failure: DeleteConsumerGroupTest.testConsumptionOnRecreatedTopicAfterTopicWideDeleteInZK

2018-06-15 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-4196.
--
Resolution: Auto Closed

Scala consumers and related tools/tests are removed in KAFKA-2983

> Transient test failure: 
> DeleteConsumerGroupTest.testConsumptionOnRecreatedTopicAfterTopicWideDeleteInZK
> ---
>
> Key: KAFKA-4196
> URL: https://issues.apache.org/jira/browse/KAFKA-4196
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Ismael Juma
>Priority: Major
>  Labels: transient-unit-test-failure
>
> The error:
> {code}
> java.lang.AssertionError: Admin path /admin/delete_topic/topic path not 
> deleted even after a replica is restarted
>   at org.junit.Assert.fail(Assert.java:88)
>   at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:752)
>   at kafka.utils.TestUtils$.verifyTopicDeletion(TestUtils.scala:1017)
>   at 
> kafka.admin.DeleteConsumerGroupTest.testConsumptionOnRecreatedTopicAfterTopicWideDeleteInZK(DeleteConsumerGroupTest.scala:156)
> {code}
> Caused by a broken invariant in the Controller: a partition exists in 
> `ControllerContext.partitionLeadershipInfo`, but not 
> `controllerContext.partitionReplicaAssignment`.
> {code}
> [2016-09-20 06:45:13,967] ERROR [BrokerChangeListener on Controller 1]: Error 
> while handling broker changes 
> (kafka.controller.ReplicaStateMachine$BrokerChangeListener:103)
> java.util.NoSuchElementException: key not found: [topic,0]
>   at scala.collection.MapLike$class.default(MapLike.scala:228)
>   at scala.collection.AbstractMap.default(Map.scala:58)
>   at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
>   at 
> kafka.controller.ControllerBrokerRequestBatch.kafka$controller$ControllerBrokerRequestBatch$$updateMetadataRequestMapFor$1(ControllerChannelManager.scala:310)
>   at 
> kafka.controller.ControllerBrokerRequestBatch$$anonfun$addUpdateMetadataRequestForBrokers$4.apply(ControllerChannelManager.scala:343)
>   at 
> kafka.controller.ControllerBrokerRequestBatch$$anonfun$addUpdateMetadataRequestForBrokers$4.apply(ControllerChannelManager.scala:343)
>   at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
>   at 
> kafka.controller.ControllerBrokerRequestBatch.addUpdateMetadataRequestForBrokers(ControllerChannelManager.scala:343)
>   at 
> kafka.controller.KafkaController.sendUpdateMetadataRequest(KafkaController.scala:1030)
>   at 
> kafka.controller.KafkaController.onBrokerFailure(KafkaController.scala:492)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ReplicaStateMachine.scala:376)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:358)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:358)
>   at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply$mcV$sp(ReplicaStateMachine.scala:357)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply(ReplicaStateMachine.scala:356)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply(ReplicaStateMachine.scala:356)
>   at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:234)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener.handleChildChange(ReplicaStateMachine.scala:355)
>   at org.I0Itec.zkclient.ZkClient$10.run(ZkClient.java:843)
>   at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
> {code}





[jira] [Resolved] (KAFKA-3150) kafka.tools.UpdateOffsetsInZK not work (sasl enabled)

2018-06-15 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-3150.
--
Resolution: Auto Closed

The Scala consumers and related tools are removed in KAFKA-2983

> kafka.tools.UpdateOffsetsInZK not work (sasl enabled)
> -
>
> Key: KAFKA-3150
> URL: https://issues.apache.org/jira/browse/KAFKA-3150
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
> Environment: redhat as6.5
>Reporter: linbao111
>Priority: Major
>
> ./bin/kafka-run-class.sh kafka.tools.UpdateOffsetsInZK earliest 
> config/consumer.properties   alalei_2  
> [2016-01-26 17:20:49,920] WARN Property sasl.kerberos.service.name is not 
> valid (kafka.utils.VerifiableProperties)
> [2016-01-26 17:20:49,920] WARN Property security.protocol is not valid 
> (kafka.utils.VerifiableProperties)
> Exception in thread "main" kafka.common.BrokerEndPointNotAvailableException: 
> End point PLAINTEXT not found for broker 1
> at kafka.cluster.Broker.getBrokerEndPoint(Broker.scala:136)
> at 
> kafka.tools.UpdateOffsetsInZK$$anonfun$getAndSetOffsets$2.apply$mcVI$sp(UpdateOffsetsInZK.scala:70)
> at 
> kafka.tools.UpdateOffsetsInZK$$anonfun$getAndSetOffsets$2.apply(UpdateOffsetsInZK.scala:59)
> at 
> kafka.tools.UpdateOffsetsInZK$$anonfun$getAndSetOffsets$2.apply(UpdateOffsetsInZK.scala:59)
> at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> at 
> kafka.tools.UpdateOffsetsInZK$.getAndSetOffsets(UpdateOffsetsInZK.scala:59)
> at kafka.tools.UpdateOffsetsInZK$.main(UpdateOffsetsInZK.scala:43)
> at kafka.tools.UpdateOffsetsInZK.main(UpdateOffsetsInZK.scala)
> same error for:
> ./bin/kafka-consumer-offset-checker.sh  --broker-info --group 
> test-consumer-group --topic alalei_2 --zookeeper slave1:2181
> [2016-01-26 17:23:45,218] WARN WARNING: ConsumerOffsetChecker is deprecated 
> and will be dropped in releases following 0.9.0. Use ConsumerGroupCommand 
> instead. (kafka.tools.ConsumerOffsetChecker$)
> Exiting due to: End point PLAINTEXT not found for broker 0.
> ./bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --zookeeper 
> slave1:2181 --group  test-consumer-group
> [2016-01-26 17:26:15,075] WARN WARNING: ConsumerOffsetChecker is deprecated 
> and will be dropped in releases following 0.9.0. Use ConsumerGroupCommand 
> instead. (kafka.tools.ConsumerOffsetChecker$)
> Exiting due to: End point PLAINTEXT not found for broker 0





[jira] [Resolved] (KAFKA-3057) "Checking consumer position" docs are referencing (only) deprecated ConsumerOffsetChecker

2018-06-15 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-3057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-3057.
--
Resolution: Fixed

Closing this as the docs were updated for kafka-consumer-groups.sh. 
Documentation for kafka-consumer-groups.sh's reset-offsets feature is tracked via KAFKA-6312

> "Checking consumer position" docs are referencing (only) deprecated 
> ConsumerOffsetChecker
> -
>
> Key: KAFKA-3057
> URL: https://issues.apache.org/jira/browse/KAFKA-3057
> Project: Kafka
>  Issue Type: Bug
>  Components: admin, website
>Affects Versions: 0.9.0.0
>Reporter: Stevo Slavic
>Priority: Trivial
>
> ["Checking consumer position" operations 
> instructions|http://kafka.apache.org/090/documentation.html#basic_ops_consumer_lag]
>  are referencing only ConsumerOffsetChecker which is mentioned as deprecated 
> in [Potential breaking changes in 
> 0.9.0.0|http://kafka.apache.org/documentation.html#upgrade_9_breaking]
> Please consider updating docs with new ways for checking consumer position, 
> covering differences between old and new way, and recommendation which one is 
> preferred and why.
> Would be nice to document (and support if not already available), not only 
> how to read/fetch/check consumer (group) offset, but also how to set offset 
> for consumer group using Kafka's operations tools.





[jira] [Resolved] (KAFKA-2537) Mirrormaker defaults to localhost with no sanity checks

2018-06-15 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-2537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-2537.
--
Resolution: Auto Closed

The old consumer is no longer supported. All the usages are removed in 
KAFKA-2983

> Mirrormaker defaults to localhost with no sanity checks
> ---
>
> Key: KAFKA-2537
> URL: https://issues.apache.org/jira/browse/KAFKA-2537
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, replication, zkclient
>Affects Versions: 0.8.2.0
>Reporter: Evan Huus
>Assignee: Neha Narkhede
>Priority: Major
>
> Short version: Like many other tools, mirror-maker's consumer defaults to 
> using the localhost zookeeper instance when no specific zookeeper source is 
> specified. It shouldn't do this. MM should also have a sanity check that the 
> source and destination clusters are different.
> Long version: We run multiple clusters, all using mirrormaker to replicate to 
> the master cluster. The kafka, zookeeper, and mirrormaker instances all run 
> on the same nodes in the master cluster since the hardware can more than 
> handle the load. We were doing some zookeeper maintenance on one of our 
> remote clusters recently which accidentally caused our configuration manager 
> (chef) to generate empty zkConnect strings for some mirrormaker instances. 
> These instances defaulted to localhost and started mirroring from the master 
> cluster back to itself, an infinite replication loop that caused all sorts of 
> havoc.
> We were able to recover gracefully and we've added additional safe-guards on 
> our end, but mirror-maker is at least partially at fault here as well. There 
> is no reason for it to treat an empty string as anything but an error - 
> especially not localhost, which is typically the target cluster, not the 
> source. Additionally, it should be trivial and very useful for mirrormaker to 
> verify it is not consuming and producing from the same cluster; I can think 
> of no legitimate use case for this kind of cycle.
> If you need any clarification or additional information, please let me know.
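The two safeguards the reporter asks for can be sketched as a pre-flight check; the method and parameter names below are illustrative, not MirrorMaker's actual configuration surface:

```java
public class MirrorSanityCheck {
    // Validate a mirroring setup before starting: reject an empty source
    // connect string (instead of silently defaulting to localhost) and refuse
    // to mirror a cluster into itself.
    public static void validate(String sourceConnect, String targetConnect) {
        if (sourceConnect == null || sourceConnect.trim().isEmpty()) {
            throw new IllegalArgumentException("source connect string must not be empty");
        }
        if (targetConnect != null && sourceConnect.trim().equals(targetConnect.trim())) {
            throw new IllegalArgumentException("source and target clusters must differ");
        }
    }

    public static void main(String[] args) {
        validate("zk-remote:2181", "zk-master:2181"); // ok: distinct clusters
    }
}
```

An empty string generated by a configuration-management bug, as in the incident described above, would then fail fast instead of creating a replication loop.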





[jira] [Resolved] (KAFKA-1942) ConsumerGroupCommand does not show offset information in ZK for deleted topic

2018-06-15 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-1942.
--
Resolution: Auto Closed

Closing as old consumer support is removed from ConsumerGroupCommand

> ConsumerGroupCommand does not show offset information in ZK for deleted topic
> -
>
> Key: KAFKA-1942
> URL: https://issues.apache.org/jira/browse/KAFKA-1942
> Project: Kafka
>  Issue Type: Bug
>Reporter: Onur Karaman
>Priority: Major
> Attachments: delete-topic-describe-group-2-10-2015.txt
>
>
> Let's say group g consumes from topic t. If we delete topic t using 
> kafka-topics.sh and then try to describe group g using ConsumerGroupCommand, 
> it won't show any of the partition rows for topic t even though the group has 
> offset information for t still in zk.





[jira] [Comment Edited] (KAFKA-7048) NPE when creating connector

2018-06-15 Thread Robert Yokota (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513966#comment-16513966
 ] 

Robert Yokota edited comment on KAFKA-7048 at 6/15/18 5:45 PM:
---

[~rsivaram], yes, this is a blocker.  Suggested a small change to the JavaDoc.  
Otherwise LGTM.


was (Author: rayokota):
Yes, this is a blocker.  Suggested a small change to the JavaDoc.  Otherwise 
LGTM.

> NPE when creating connector
> ---
>
> Key: KAFKA-7048
> URL: https://issues.apache.org/jira/browse/KAFKA-7048
> Project: Kafka
>  Issue Type: Bug
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Blocker
> Fix For: 2.0.0
>
>
> KAFKA-6886 introduced the ConfigTransformer to transform the given 
> configuration data. ConfigTransformer#transform(Map<String, String>) expects 
> the passed config not to be null, but DistributedHerder#putConnectorConfig calls 
> #transform before updating the snapshot (see below). Hence, it causes the 
> NPE. 
> {code:java}
> // Note that we use the updated connector config despite the fact that we 
> // don't have an updated snapshot yet. The existing task info should still 
> // be accurate.
> Map<String, String> map = configState.connectorConfig(connName);
> ConnectorInfo info = new ConnectorInfo(connName, config, 
> configState.tasks(connName),
> map == null ? null : 
> connectorTypeForClass(map.get(ConnectorConfig.CONNECTOR_CLASS_CONFIG)));
> callback.onCompletion(null, new Created<>(!exists, info));
> return null;{code}
> We can add a null check on "configs" (see below) to resolve the NPE. It means 
> we WON'T pass a null configs map to the configTransformer:
> {code:java}
> public Map<String, String> connectorConfig(String connector) {
> Map<String, String> configs = connectorConfigs.get(connector);
> if (configTransformer != null) { // add a condition "configs != null"
> configs = configTransformer.transform(connector, configs);
> }
> return configs;
> }{code}
>  
>  
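The guard proposed above can be exercised without Connect itself. Below is a hypothetical, simplified stand-in (the class name, the map-based store, and the `UnaryOperator` transformer are illustrative, not the real `ClusterConfigState`/`ConfigTransformer` classes) showing that, with the extra `configs != null` condition, looking up an unknown connector returns null instead of throwing an NPE inside the transformer:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch of the proposed null guard; names are simplified
// and do not match the actual Kafka Connect classes.
public class NullGuardSketch {
    private final Map<String, Map<String, String>> connectorConfigs = new HashMap<>();
    private final UnaryOperator<Map<String, String>> configTransformer;

    public NullGuardSketch(UnaryOperator<Map<String, String>> transformer) {
        this.configTransformer = transformer;
    }

    public void put(String connector, Map<String, String> config) {
        connectorConfigs.put(connector, config);
    }

    // With the "configs != null" condition, an unknown connector yields null
    // instead of handing a null map to the transformer (the NPE in the report).
    public Map<String, String> connectorConfig(String connector) {
        Map<String, String> configs = connectorConfigs.get(connector);
        if (configTransformer != null && configs != null) {
            configs = configTransformer.apply(configs);
        }
        return configs;
    }

    public static void main(String[] args) {
        NullGuardSketch state = new NullGuardSketch(HashMap::new);
        // Unknown connector: before the guard this path would NPE; with it,
        // the lookup simply returns null.
        System.out.println(state.connectorConfig("missing") == null); // prints "true"
    }
}
```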



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-857) kafka.admin.ListTopicCommand, 'by broker' display/filter

2018-06-15 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-857.
-
Resolution: Auto Closed

Closing inactive issue.  This functionality can be achieved by using 
KafkaAdminClient methods.

> kafka.admin.ListTopicCommand, 'by broker' display/filter
> 
>
> Key: KAFKA-857
> URL: https://issues.apache.org/jira/browse/KAFKA-857
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Reporter: Dave DeMaagd
>Priority: Minor
>
> Would be nice if there were an option for kafka.admin.ListTopicCommand that 
> would filter results by broker (either ID or hostname).  Could be helpful in 
> troubleshooting some cases of broker problems (e.g. look at the 
> topic/partition ownership information for a particular broker, maybe for 
> underreplication, maybe to see what the impact is for messing with a 
> particular broker). 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-796) Kafka Scala classes should declare thrown checked exceptions to be Java friendly

2018-06-15 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-796.
-
Resolution: Auto Closed

Closing inactive issue. The scala clients are no longer supported.

> Kafka Scala classes should declare thrown checked exceptions to be Java 
> friendly
> 
>
> Key: KAFKA-796
> URL: https://issues.apache.org/jira/browse/KAFKA-796
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: Darren Sargent
>Priority: Minor
>
> For example, ConsumerIterator makeNext() method calls BlockingQueue.take() 
> which declares it throws InterruptedException. However, since makeNext() 
> fails to redeclare this exception, Java client code will be unable to catch 
> it -- javac will complain that InterruptedException cannot be thrown.
> Workaround - in the Java client code, catch Exception then check if 
> instanceof InterruptedException and respond accordingly. But really the Scala 
> method should redeclare checked exceptions for Java's benefit, even though 
> it's not required for Scala since there are no checked exceptions.
> There may be other classes where this needs to be done as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KAFKA-7063) Update documentation to remove references to old producers and consumers

2018-06-15 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar reassigned KAFKA-7063:


Assignee: Manikumar

> Update documentation to remove references to old producers and consumers
> 
>
> Key: KAFKA-7063
> URL: https://issues.apache.org/jira/browse/KAFKA-7063
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Assignee: Manikumar
>Priority: Major
>  Labels: newbie
>
> We should also remove any mention of "new consumer" or "new producer". They 
> should just be "producer" and "consumer".
> Finally, any mention of "Scala producer/consumer/client" should also be 
> removed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7021) Source KTable checkpoint is not correct

2018-06-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514001#comment-16514001
 ] 

ASF GitHub Bot commented on KAFKA-7021:
---

guozhangwang closed pull request #5195: KAFKA-7021: Update upgrade guide 
section for reusing source topic
URL: https://github.com/apache/kafka/pull/5195
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/docs/streams/upgrade-guide.html b/docs/streams/upgrade-guide.html
index 07f85446424..cd9278262b7 100644
--- a/docs/streams/upgrade-guide.html
+++ b/docs/streams/upgrade-guide.html
@@ -34,7 +34,7 @@ Upgrade Guide and API Changes
 
 
 
-Upgrading from any older version to 2.0.0 is possible: (1) you need to 
make sure to update you code accordingly, because there are some minor 
non-compatible API changes since older
+Upgrading from any older version to 2.0.0 is possible: (1) you need to 
make sure to update you code and config accordingly, because there are some 
minor non-compatible API changes since older
 releases (the code changes are expected to be minimal, please see 
below for the details),
 (2) upgrading to 2.0.0 in the online mode requires two rolling bounces.
 For (2), in the first rolling bounce phase users need to set config 
upgrade.from="older version" (possible values are "0.10.0", 
"0.10.1", "0.10.2", "0.11.0", "1.0", and "1.1")
@@ -59,6 +59,16 @@ Upgrade Guide and API Changes
 For Kafka Streams 0.10.0, broker version 0.10.0 or higher is required.
 
 
+
+Another important thing to keep in mind: in deprecated 
KStreamBuilder class, when a KTable is created from a 
source topic via KStreamBuilder.table(), its materialized state 
store
+will reuse the source topic as its changelog topic for restoring, and 
will disable logging to avoid appending new updates to the source topic; in the 
StreamsBuilder class introduced in 1.0, this behavior was changed
+accidentally: we still reuse the source topic as the changelog topic 
for restoring, but will also create a separate changelog topic to append the 
update records from source topic to. In the 2.0 release, we have fixed this 
issue and now users
+can choose whether or not to reuse the source topic based on the 
StreamsConfig#TOPOLOGY_OPTIMIZATION: if you are upgrading from the 
old KStreamBuilder class and hence you need to change your code to 
use
+the new StreamsBuilder, you should set this config value 
to StreamsConfig#OPTIMIZE to continue reusing the source topic; if 
you are upgrading from 1.0 or 1.1 where you are already using 
StreamsBuilder and hence have already
+created a separate changelog topic, you should set this config value 
to StreamsConfig#NO_OPTIMIZATION when upgrading to 2.0.0 in order 
to use that changelog topic for restoring the state store.
+More details about the new config 
StreamsConfig#TOPOLOGY_OPTIMIZATION can be found in KIP-295: https://cwiki.apache.org/confluence/display/KAFKA/KIP-295%3A+Add+Streams+Configuration+Allowing+for+Optional+Topology+Optimization
+
+
 
 In 2.0.0 we have added a few new APIs on the 
ReadOnlyWindowStore interface (for details please read Streams API changes below).
 If you have customized window store implementations that extends the 
ReadOnlyWindowStore interface you need to make code changes.


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Source KTable checkpoint is not correct
> ---
>
> Key: KAFKA-7021
> URL: https://issues.apache.org/jira/browse/KAFKA-7021
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Matthias J. Sax
>Assignee: Guozhang Wang
>Priority: Major
> Fix For: 0.10.2.2, 2.0.0, 0.11.0.3, 1.0.2, 1.1.1
>
>
> Kafka Streams treats source KTables, i.e., tables created via 
> `builder.table()`, differently. Instead of creating a changelog topic, the 
> original source topic is used to avoid unnecessary data redundancy.
> However, Kafka Streams does not write a correct local state checkpoint file. 
> This results in an unnecessary state restore after a rebalance. Instead of the 
> latest committed offset, the latest restored offset is written into the 
> checkpoint file in `ProcessorStateManager#close()`.
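The source-topic reuse choice described in the upgrade-guide diff above comes down to a single Streams config. A hedged sketch of the two upgrade paths, using the plain string key and values as documented for 2.0 (`"topology.optimization"` with `"all"`/`"none"`); the application id and bootstrap servers are placeholders:

```java
import java.util.Properties;

public class UpgradeConfigSketch {
    // Builds Streams properties for the two 2.0.0 upgrade paths:
    // - migrating from the deprecated KStreamBuilder: set "all"
    //   (StreamsConfig#OPTIMIZE) to keep reusing the source topic as changelog;
    // - already on StreamsBuilder from 1.0/1.1: set "none"
    //   (StreamsConfig#NO_OPTIMIZATION) to keep the separate changelog topic.
    public static Properties streamsUpgradeProps(boolean migratedFromKStreamBuilder) {
        Properties props = new Properties();
        props.put("application.id", "my-streams-app");       // placeholder
        props.put("bootstrap.servers", "localhost:9092");    // placeholder
        props.put("topology.optimization",
                  migratedFromKStreamBuilder ? "all" : "none");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(
            streamsUpgradeProps(true).getProperty("topology.optimization")); // prints "all"
    }
}
```

In real code you would use the `StreamsConfig` constants rather than raw strings, so typos are caught at compile time.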



--

[jira] [Resolved] (KAFKA-3798) Kafka Consumer 0.10.0.0 killed after rebalancing exception

2018-06-15 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-3798.
--
Resolution: Auto Closed

Closing inactive issue. The old consumer is no longer supported. Please upgrade 
to the Java consumer whenever possible.

> Kafka Consumer 0.10.0.0 killed after rebalancing exception
> --
>
> Key: KAFKA-3798
> URL: https://issues.apache.org/jira/browse/KAFKA-3798
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Affects Versions: 0.10.0.0
> Environment: Production
>Reporter: Sahitya Agrawal
>Assignee: Neha Narkhede
>Priority: Major
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Hi, 
> I have a topic with 100 partitions and 25 consumers. The consumers were 
> working fine for some time. 
> After a while I started seeing a Kafka rebalancing exception in the logs. CPU 
> usage was also 100% at that time. The consumer process got killed after that. 
> Kafka version : 0.10.0.0
> Some Error print from the logs are following:
> kafka.common.ConsumerRebalanceFailedException: prod_ip- can't rebalance 
> after 10 retries
> at 
> kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:670)
> at 
> kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anon$2.run(ZookeeperConsumerConnector.scala:589)
> exception during rebalance
> org.I0Itec.zkclient.exception.ZkNoNodeException: 
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
> NoNode for /consumers/prod/ids/prod_ip-***
> at 
> org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47)
> at 
> org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:1000)
> at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:1099)
> at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:1094)
> at kafka.utils.ZkUtils.readData(ZkUtils.scala:542)
> at kafka.consumer.TopicCount$.constructTopicCount(TopicCount.scala:61)
> at 
> kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.kafka$consumer$ZookeeperConsumerConnector$ZKRebalancerListener$$rebalance(ZookeeperConsumerConnector.scala:674)
> at 
> kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anonfun$syncedRebalance$1$$anonfun$apply$mcV$sp$1.apply$mcVI$sp(ZookeeperConsumerConnector.scala:646)
> at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
> at 
> kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anonfun$syncedRebalance$1.apply$mcV$sp(ZookeeperConsumerConnector.scala:637)
> at 
> kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anonfun$syncedRebalance$1.apply(ZookeeperConsumerConnector.scala:637)
> at 
> kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anonfun$syncedRebalance$1.apply(ZookeeperConsumerConnector.scala:637)
> at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
> at 
> kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:636)
> at 
> kafka.consumer.ZookeeperConsumerConnector$ZKSessionExpireListener.handleNewSession(ZookeeperConsumerConnector.scala:522)
> at org.I0Itec.zkclient.ZkClient$6.run(ZkClient.java:735)
> at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
> Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
> KeeperErrorCode = NoNode for /consumers/prod/ids/prod_ip-**
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1184)
> at org.I0Itec.zkclient.ZkConnection.readData(ZkConnection.java:124)
> at org.I0Itec.zkclient.ZkClient$12.call(ZkClient.java:1103)
> at org.I0Itec.zkclient.ZkClient$12.call(ZkClient.java:1099)
> at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:990)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-4967) java.io.EOFException Error while committing offsets

2018-06-15 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-4967.
--
Resolution: Auto Closed

Closing inactive issue. The old consumer is no longer supported. Please upgrade 
to the Java consumer whenever possible.

> java.io.EOFException Error while committing offsets
> ---
>
> Key: KAFKA-4967
> URL: https://issues.apache.org/jira/browse/KAFKA-4967
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.0.1
> Environment: OS : CentOS
>Reporter: Upendra Yadav
>Priority: Major
>
> kafka server and client : 0.10.0.1
> The consumer and producer sides use the latest Kafka jars mentioned above, but 
> still use the old consumer APIs in code. 
> kafka server side configuration :
> listeners=PLAINTEXT://:9092
> #below configuration is for old clients, that was exists before. but now 
> every clients are already moved with latest kafka client - 0.10.0.1
> log.message.format.version=0.8.2.1
> broker.id.generation.enable=false
> unclean.leader.election.enable=false
> Some of configurations for kafka consumer :
> auto.commit.enable is overridden to false
> auto.offset.reset is overridden to smallest
> consumer.timeout.ms is overridden to 100
> dual.commit.enabled is overridden to true
> fetch.message.max.bytes is overridden to 209715200
> group.id is overridden to crm_topic1_hadoop_tables
> offsets.storage is overridden to kafka
> rebalance.backoff.ms is overridden to 6000
> zookeeper.session.timeout.ms is overridden to 23000
> zookeeper.sync.time.ms is overridden to 2000
> Below is the exception I'm getting on offset commit.
> The consumer process is still running after this exception,
> but when I check the offset position through the Kafka shell scripts it shows 
> the old position (Could not fetch offset from topic1_group1 partition [topic1,0] 
> due to missing offset data in zookeeper). After some time, when the 2nd commit 
> comes, it gets updated.
> Because dual commit is enabled, I think the Kafka-side position gets updated 
> successfully both times.
> ERROR kafka.consumer.ZookeeperConsumerConnector: [], Error while 
> committing offsets.
> java.io.EOFException
> at 
> org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:83)
> at 
> kafka.network.BlockingChannel.readCompletely(BlockingChannel.scala:129)
> at kafka.network.BlockingChannel.receive(BlockingChannel.scala:120)
> at 
> kafka.consumer.ZookeeperConsumerConnector.liftedTree2$1(ZookeeperConsumerConnector.scala:354)
> at 
> kafka.consumer.ZookeeperConsumerConnector.commitOffsets(ZookeeperConsumerConnector.scala:351)
> at 
> kafka.consumer.ZookeeperConsumerConnector.commitOffsets(ZookeeperConsumerConnector.scala:331)
> at 
> kafka.javaapi.consumer.ZookeeperConsumerConnector.commitOffsets(ZookeeperConsumerConnector.scala:111)
> at 
> com.zoho.mysqlbackup.kafka.consumer.KafkaHLConsumer.commitOffset(KafkaHLConsumer.java:173)
> at 
> com.zoho.mysqlbackup.kafka.consumer.KafkaHLConsumer.run(KafkaHLConsumer.java:271)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-966) Allow high level consumer to 'nak' a message and force Kafka to close the KafkaStream without losing that message

2018-06-15 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-966.
-
Resolution: Auto Closed

Closing inactive issue. The old consumer is no longer supported. 

> Allow high level consumer to 'nak' a message and force Kafka to close the 
> KafkaStream without losing that message
> -
>
> Key: KAFKA-966
> URL: https://issues.apache.org/jira/browse/KAFKA-966
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Affects Versions: 0.8.0
>Reporter: Chris Curtin
>Assignee: Neha Narkhede
>Priority: Minor
>
> Enhancement request.
> The high level consumer is very close to handling a lot of situations a 
> 'typical' client would need, except for when the message received from Kafka 
> is valid but the business logic that wants to consume it has a problem.
> For example if I want to write the value to a MongoDB or Cassandra database 
> and the database is not available. I won't know until I go to do the write 
> that the database isn't available, but by then it is too late to NOT read the 
> message from Kafka. Thus if I call shutdown() to stop reading, that message 
> is lost since the offset Kafka writes to ZooKeeper is the next offset.
> Ideally I'd like to be able to tell Kafka: close the KafkaStream but set the 
> next offset to read for this partition to this message when I start up again. 
> And if there are any messages in the BlockingQueue for other partitions, find 
> the lowest # and use it for that partitions offset since I haven't consumed 
> them yet.
> Thus I can cleanly shutdown my processing, resolve whatever the issue is and 
> restart the process.
> Another idea might be to allow a 'peek' into the next message and if I 
> succeed in writing to the database call 'next' to remove it from the queue. 
> I understand this won't deal with a 'kill -9' or hard failure of the JVM 
> leading to the latest offsets not being written to ZooKeeper but it addresses 
> a likely common scenario for consumers. Nor will it add true transactional 
> support since the ZK update could fail.
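The "nak" the reporter asks for is essentially "only advance the committed offset after the business logic succeeds", which the modern Java consumer supports via manual commits and seek(). A minimal in-memory sketch of the idea (no Kafka dependency; all names and the list-as-partition model are hypothetical):

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: advance the committed offset only once the business
// logic (e.g. a database write) succeeds, so a failed write can be retried
// from the same position after a restart instead of being lost.
public class NakSketch {
    public static int process(List<String> partition, int committedOffset,
                              Predicate<String> writeToDb) {
        int offset = committedOffset;
        while (offset < partition.size()) {
            String record = partition.get(offset);
            if (!writeToDb.test(record)) {
                break; // "nak": stop without advancing; the record is not lost
            }
            offset++; // "ack": commit past the record only after success
        }
        return offset; // next offset to resume from
    }

    public static void main(String[] args) {
        List<String> records = List.of("a", "b", "c");
        // Simulated database that fails on "b": processing stops and we
        // resume at offset 1, so "b" will be retried.
        int next = process(records, 0, r -> !r.equals("b"));
        System.out.println(next); // prints "1"
    }
}
```

With a real consumer the same shape is: disable auto-commit, commitSync() after each successful write, and on failure seek() back to the failed record's offset before shutting down.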



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-2546) CPU utilization very high when no kafka node alive

2018-06-15 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-2546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-2546.
--
Resolution: Auto Closed

Closing inactive issue. The old consumer is no longer supported. Please upgrade 
to the Java consumer whenever possible.

> CPU utilization very high when no kafka node alive
> --
>
> Key: KAFKA-2546
> URL: https://issues.apache.org/jira/browse/KAFKA-2546
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.8.2.1
>Reporter: Diego Erdody
>Assignee: Neha Narkhede
>Priority: Minor
>
> If you call kafka.consumer.Consumer.createJavaConsumerConnector, and no 
> broker is found in ZK, you end up in 
> kafka.client.ClientUtils.channelToAnyBroker looping continuously with no wait 
> at all.
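The busy loop described above is typically fixed with a bounded exponential backoff between connection attempts. A hedged sketch of that pattern (method and class names are hypothetical, not the actual ClientUtils code):

```java
import java.util.function.Supplier;

public class BackoffSketch {
    // Retry an attempt (e.g. acquiring a channel to any broker) with
    // exponential backoff instead of spinning; returns null if every
    // attempt fails.
    public static <T> T retryWithBackoff(Supplier<T> attempt, int maxRetries,
                                         long initialBackoffMs) throws InterruptedException {
        long backoff = initialBackoffMs;
        for (int i = 0; i < maxRetries; i++) {
            T result = attempt.get();
            if (result != null) {
                return result;
            }
            Thread.sleep(backoff);                    // avoid 100% CPU when no broker is alive
            backoff = Math.min(backoff * 2, 30_000);  // cap the wait at 30s
        }
        return null;
    }

    public static void main(String[] args) throws InterruptedException {
        // Succeeds on the third attempt; sleeps 1ms then 2ms in between.
        int[] calls = {0};
        String channel = retryWithBackoff(
                () -> ++calls[0] >= 3 ? "channel" : null, 5, 1);
        System.out.println(channel); // prints "channel"
    }
}
```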



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KAFKA-1013) Modify existing tools as per the changes in KAFKA-1000

2018-06-15 Thread Manikumar (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513988#comment-16513988
 ] 

Manikumar edited comment on KAFKA-1013 at 6/15/18 3:42 PM:
---

ConsumerOffsetChecker tool is removed. Other tools will be removed as part 
KAFKA-2983


was (Author: omkreddy):
ConsumerOffsetChecker tool is removoed. Other tools will be removed as part 
KAFKA-2983

> Modify existing tools as per the changes in KAFKA-1000
> --
>
> Key: KAFKA-1013
> URL: https://issues.apache.org/jira/browse/KAFKA-1013
> Project: Kafka
>  Issue Type: Sub-task
>  Components: consumer
>Reporter: Tejas Patil
>Assignee: Mayuresh Gharat
>Priority: Minor
> Attachments: KAFKA-1013.patch, KAFKA-1013.patch, 
> KAFKA-1013_2014-09-23_10:45:59.patch, KAFKA-1013_2014-09-23_10:48:07.patch, 
> KAFKA-1013_2014-09-26_18:52:09.patch, KAFKA-1013_2014-10-01_21:05:00.patch, 
> KAFKA-1013_2014-10-02_22:42:58.patch, KAFKA-1013_2014-12-21_14:42:49.patch, 
> KAFKA-1013_2014-12-27_16:15:54.patch, KAFKA-1013_2014-12-27_16:20:57.patch, 
> KAFKA-1013_2015-01-13_13:56:39.patch, KAFKA-1013_2015-01-13_16:43:03.patch
>
>
> Modify existing tools as per the changes in KAFKA-1000. AFAIK, the tools 
> below would be affected:
> - ConsumerOffsetChecker
> - ExportZkOffsets
> - ImportZkOffsets
> - UpdateOffsetsInZK
> - Consumer offset checker should show the offset manager and offsets partition



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7021) Source KTable checkpoint is not correct

2018-06-15 Thread Guozhang Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-7021.
--
Resolution: Fixed

> Source KTable checkpoint is not correct
> ---
>
> Key: KAFKA-7021
> URL: https://issues.apache.org/jira/browse/KAFKA-7021
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Matthias J. Sax
>Assignee: Guozhang Wang
>Priority: Major
> Fix For: 0.10.2.2, 2.0.0, 0.11.0.3, 1.0.2, 1.1.1
>
>
> Kafka Streams treats source KTables, i.e., tables created via 
> `builder.table()`, differently. Instead of creating a changelog topic, the 
> original source topic is used to avoid unnecessary data redundancy.
> However, Kafka Streams does not write a correct local state checkpoint file. 
> This results in an unnecessary state restore after a rebalance. Instead of the 
> latest committed offset, the latest restored offset is written into the 
> checkpoint file in `ProcessorStateManager#close()`.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7021) Source KTable checkpoint is not correct

2018-06-15 Thread Guozhang Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513969#comment-16513969
 ] 

Guozhang Wang commented on KAFKA-7021:
--

Cherry-picked the bug fix to 0.10.2, 0.11.0, 1.0, 1.1.

> Source KTable checkpoint is not correct
> ---
>
> Key: KAFKA-7021
> URL: https://issues.apache.org/jira/browse/KAFKA-7021
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Matthias J. Sax
>Assignee: Guozhang Wang
>Priority: Major
> Fix For: 0.10.2.2, 2.0.0, 0.11.0.3, 1.0.2, 1.1.1
>
>
> Kafka Streams treats source KTables, i.e., tables created via 
> `builder.table()`, differently. Instead of creating a changelog topic, the 
> original source topic is used to avoid unnecessary data redundancy.
> However, Kafka Streams does not write a correct local state checkpoint file. 
> This results in an unnecessary state restore after a rebalance. Instead of the 
> latest committed offset, the latest restored offset is written into the 
> checkpoint file in `ProcessorStateManager#close()`.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7048) NPE when creating connector

2018-06-15 Thread Robert Yokota (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513966#comment-16513966
 ] 

Robert Yokota commented on KAFKA-7048:
--

Yes, this is a blocker.  Suggested a small change to the JavaDoc.  Otherwise 
LGTM.

> NPE when creating connector
> ---
>
> Key: KAFKA-7048
> URL: https://issues.apache.org/jira/browse/KAFKA-7048
> Project: Kafka
>  Issue Type: Bug
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Blocker
> Fix For: 2.0.0
>
>
> KAFKA-6886 introduced the ConfigTransformer to transform the given 
> configuration data. ConfigTransformer#transform(Map<String, String>) expects 
> the passed config not to be null, but DistributedHerder#putConnectorConfig calls 
> #transform before updating the snapshot (see below). Hence, it causes the 
> NPE. 
> {code:java}
> // Note that we use the updated connector config despite the fact that we 
> // don't have an updated snapshot yet. The existing task info should still 
> // be accurate.
> Map<String, String> map = configState.connectorConfig(connName);
> ConnectorInfo info = new ConnectorInfo(connName, config, 
> configState.tasks(connName),
> map == null ? null : 
> connectorTypeForClass(map.get(ConnectorConfig.CONNECTOR_CLASS_CONFIG)));
> callback.onCompletion(null, new Created<>(!exists, info));
> return null;{code}
> We can add a null check on "configs" (see below) to resolve the NPE. It means 
> we WON'T pass a null configs map to the configTransformer:
> {code:java}
> public Map<String, String> connectorConfig(String connector) {
> Map<String, String> configs = connectorConfigs.get(connector);
> if (configTransformer != null) { // add a condition "configs != null"
> configs = configTransformer.transform(connector, configs);
> }
> return configs;
> }{code}
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7012) Performance issue upgrading to kafka 1.0.1 or 1.1

2018-06-15 Thread Rajini Sivaram (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513957#comment-16513957
 ] 

Rajini Sivaram commented on KAFKA-7012:
---

[~patl] Are you actively working on this JIRA? If not, can I pick it up and 
provide a simple solution to include in 2.0.0 - we are very close to cutting 
the first RC and it will be good to include at least a simple fix for this 
issue (we can improve on that in a follow on PR if required after the release). 
Let me know if you have any objections. Thank you!

> Performance issue upgrading to kafka 1.0.1 or 1.1
> -
>
> Key: KAFKA-7012
> URL: https://issues.apache.org/jira/browse/KAFKA-7012
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 1.1.0, 1.0.1
>Reporter: rajadayalan perumalsamy
>Assignee: praveen
>Priority: Major
>  Labels: regression
> Attachments: Commit-47ee8e954-0607-bufferkeys-nopoll-profile.png, 
> Commit-47ee8e954-0607-memory.png, Commit-47ee8e954-0607-profile.png, 
> Commit-47ee8e954-profile.png, Commit-47ee8e954-profile2.png, 
> Commit-f15cdbc91b-profile.png, Commit-f15cdbc91b-profile2.png
>
>
> We are trying to upgrade kafka cluster from Kafka 0.11.0.1 to Kafka 1.0.1. 
> After upgrading 1 node on the cluster, we notice that network threads use 
> most of the cpu. It is a 3 node cluster with 15k messages/sec on each node. 
> With Kafka 0.11.0.1 typical usage of the servers is around 50 to 60% 
> vcpu(using less than 1 vcpu). After upgrade we are noticing that cpu usage is 
> high depending on the number of network threads used. If network threads is 
> set to 8, then the cpu usage is around 850%(9 vcpus) and if it is set to 4 
> then the cpu usage is around 450%(5 vcpus). Using the same kafka 
> server.properties for both.
> Did further analysis with git bisect and a couple of builds and deploys, and traced the 
> issue to commit 47ee8e954df62b9a79099e944ec4be29afe046f6. CPU usage is fine 
> for commit f15cdbc91b240e656d9a2aeb6877e94624b21f8d. But with commit 
> 47ee8e954df62b9a79099e944ec4be29afe046f6 cpu usage has increased. Have 
> attached screenshots of profiling done with both the commits. Screenshot 
> Commit-f15cdbc91b-profile shows less cpu usage by network threads and 
> Screenshots Commit-47ee8e954-profile and Commit-47ee8e954-profile2 show 
> higher cpu usage(almost entire cpu usage) by network threads. Also noticed 
> that kafka.network.Processor.poll() method is invoked 10 times more with 
> commit 47ee8e954df62b9a79099e944ec4be29afe046f6.
> We need the issue to be resolved to upgrade the cluster. Please let me know 
> if you need any additional information.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7060) Command-line overrides for ConnectDistributed worker properties

2018-06-15 Thread Kevin Lafferty (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513922#comment-16513922
 ] 

Kevin Lafferty commented on KAFKA-7060:
---

[~enether] thanks, I've updated to Minor / Improvement.

> Command-line overrides for ConnectDistributed worker properties
> ---
>
> Key: KAFKA-7060
> URL: https://issues.apache.org/jira/browse/KAFKA-7060
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Reporter: Kevin Lafferty
>Priority: Minor
>
> This Jira is for tracking the implementation for 
> [KIP-316|https://cwiki.apache.org/confluence/display/KAFKA/KIP-316%3A+Command-line+overrides+for+ConnectDistributed+worker+properties].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7060) Command-line overrides for ConnectDistributed worker properties

2018-06-15 Thread Kevin Lafferty (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Lafferty updated KAFKA-7060:
--
  Priority: Minor  (was: Major)
Issue Type: Improvement  (was: Bug)

> Command-line overrides for ConnectDistributed worker properties
> ---
>
> Key: KAFKA-7060
> URL: https://issues.apache.org/jira/browse/KAFKA-7060
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Reporter: Kevin Lafferty
>Priority: Minor
>
> This Jira is for tracking the implementation for 
> [KIP-316|https://cwiki.apache.org/confluence/display/KAFKA/KIP-316%3A+Command-line+overrides+for+ConnectDistributed+worker+properties].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7012) Performance issue upgrading to kafka 1.0.1 or 1.1

2018-06-15 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-7012:
---
Labels: regression  (was: )

> Performance issue upgrading to kafka 1.0.1 or 1.1
> -
>
> Key: KAFKA-7012
> URL: https://issues.apache.org/jira/browse/KAFKA-7012
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 1.1.0, 1.0.1
>Reporter: rajadayalan perumalsamy
>Assignee: praveen
>Priority: Major
>  Labels: regression
> Attachments: Commit-47ee8e954-0607-bufferkeys-nopoll-profile.png, 
> Commit-47ee8e954-0607-memory.png, Commit-47ee8e954-0607-profile.png, 
> Commit-47ee8e954-profile.png, Commit-47ee8e954-profile2.png, 
> Commit-f15cdbc91b-profile.png, Commit-f15cdbc91b-profile2.png
>
>
> We are trying to upgrade kafka cluster from Kafka 0.11.0.1 to Kafka 1.0.1. 
> After upgrading 1 node on the cluster, we notice that network threads use 
> most of the cpu. It is a 3 node cluster with 15k messages/sec on each node. 
> With Kafka 0.11.0.1 typical usage of the servers is around 50 to 60% 
> vcpu(using less than 1 vcpu). After upgrade we are noticing that cpu usage is 
> high depending on the number of network threads used. If network threads is 
> set to 8, then the cpu usage is around 850%(9 vcpus) and if it is set to 4 
> then the cpu usage is around 450%(5 vcpus). Using the same kafka 
> server.properties for both.
> Did further analysis with git bisect, couple of build and deploys, traced the 
> issue to commit 47ee8e954df62b9a79099e944ec4be29afe046f6. CPU usage is fine 
> for commit f15cdbc91b240e656d9a2aeb6877e94624b21f8d. But with commit 
> 47ee8e954df62b9a79099e944ec4be29afe046f6 cpu usage has increased. Have 
> attached screenshots of profiling done with both the commits. Screenshot 
> Commit-f15cdbc91b-profile shows less cpu usage by network threads and 
> Screenshots Commit-47ee8e954-profile and Commit-47ee8e954-profile2 show 
> higher cpu usage(almost entire cpu usage) by network threads. Also noticed 
> that kafka.network.Processor.poll() method is invoked 10 times more with 
> commit 47ee8e954df62b9a79099e944ec4be29afe046f6.
> We need the issue to be resolved to upgrade the cluster. Please let me know 
> if you need any additional information.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7012) Performance issue upgrading to kafka 1.0.1 or 1.1

2018-06-15 Thread Ismael Juma (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513910#comment-16513910
 ] 

Ismael Juma commented on KAFKA-7012:


[~rsivaram] [~junrao], what are your thoughts on this one? It's a performance 
regression so it would be good to address.

> Performance issue upgrading to kafka 1.0.1 or 1.1
> -
>
> Key: KAFKA-7012
> URL: https://issues.apache.org/jira/browse/KAFKA-7012
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 1.1.0, 1.0.1
>Reporter: rajadayalan perumalsamy
>Assignee: praveen
>Priority: Major
>  Labels: regression
> Attachments: Commit-47ee8e954-0607-bufferkeys-nopoll-profile.png, 
> Commit-47ee8e954-0607-memory.png, Commit-47ee8e954-0607-profile.png, 
> Commit-47ee8e954-profile.png, Commit-47ee8e954-profile2.png, 
> Commit-f15cdbc91b-profile.png, Commit-f15cdbc91b-profile2.png
>
>
> We are trying to upgrade our Kafka cluster from Kafka 0.11.0.1 to Kafka 1.0.1. 
> After upgrading 1 node of the cluster, we noticed that the network threads use 
> most of the CPU. It is a 3-node cluster with 15k messages/sec on each node. 
> With Kafka 0.11.0.1, typical usage of the servers is around 50 to 60% 
> vcpu (using less than 1 vcpu). After the upgrade, CPU usage is high and scales 
> with the number of network threads used: if network threads are set to 8, CPU 
> usage is around 850% (9 vcpus), and if set to 4, around 450% (5 vcpus), using 
> the same Kafka server.properties for both.
> Further analysis with git bisect, and a couple of builds and deploys, traced 
> the issue to commit 47ee8e954df62b9a79099e944ec4be29afe046f6. CPU usage is 
> fine for commit f15cdbc91b240e656d9a2aeb6877e94624b21f8d, but with commit 
> 47ee8e954df62b9a79099e944ec4be29afe046f6 it has increased. We have attached 
> screenshots of profiling done with both commits: Commit-f15cdbc91b-profile 
> shows less CPU usage by the network threads, while Commit-47ee8e954-profile 
> and Commit-47ee8e954-profile2 show higher CPU usage (almost the entire CPU 
> usage) by the network threads. We also noticed that the 
> kafka.network.Processor.poll() method is invoked 10 times more often with 
> commit 47ee8e954df62b9a79099e944ec4be29afe046f6.
> We need this issue to be resolved to upgrade the cluster. Please let me know 
> if you need any additional information.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KAFKA-7062) Simplify MirrorMaker loop after removal of old consumer support

2018-06-15 Thread Andras Beni (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Beni reassigned KAFKA-7062:
--

Assignee: Andras Beni

> Simplify MirrorMaker loop after removal of old consumer support
> ---
>
> Key: KAFKA-7062
> URL: https://issues.apache.org/jira/browse/KAFKA-7062
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Assignee: Andras Beni
>Priority: Minor
>  Labels: newbie
>
> Once KAFKA-2983 is merged, we can simplify the MirrorMaker loop to be a 
> single loop instead of two nested loops. In the old consumer, even if there 
> was no message, offsets would still be committed, so receive() could block. 
> The new consumer doesn't have this issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7063) Update documentation to remove references to old producers and consumers

2018-06-15 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-7063:
--

 Summary: Update documentation to remove references to old 
producers and consumers
 Key: KAFKA-7063
 URL: https://issues.apache.org/jira/browse/KAFKA-7063
 Project: Kafka
  Issue Type: Improvement
Reporter: Ismael Juma


We should also remove any mention of "new consumer" or "new producer". They 
should just be "producer" and "consumer".

Finally, any mention of "Scala producer/consumer/client" should also be removed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7062) Simplify MirrorMaker loop after removal of old consumer support

2018-06-15 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-7062:
---
Description: Once KAFKA-2983 is merged, we can simplify the MirrorMaker 
loop to be a single loop instead of two nested loops. In the old consumer, even 
if there is no message offsets would still be committed so receive() could 
block. The new consumer doesn't have this issue.  (was: Once KAFKA-2983 is 
merged, we can simplify the MirrorMaker loop to be a single loop instead of two 
loops. In the old consumer, even if there is no message offsets would still be 
committed so receive() could block. The new consumer doesn't have this issue.)

> Simplify MirrorMaker loop after removal of old consumer support
> ---
>
> Key: KAFKA-7062
> URL: https://issues.apache.org/jira/browse/KAFKA-7062
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Priority: Minor
>  Labels: newbie
>
> Once KAFKA-2983 is merged, we can simplify the MirrorMaker loop to be a 
> single loop instead of two nested loops. In the old consumer, even if there 
> was no message, offsets would still be committed, so receive() could block. 
> The new consumer doesn't have this issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7062) Simplify MirrorMaker loop after removal of old consumer support

2018-06-15 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-7062:
--

 Summary: Simplify MirrorMaker loop after removal of old consumer 
support
 Key: KAFKA-7062
 URL: https://issues.apache.org/jira/browse/KAFKA-7062
 Project: Kafka
  Issue Type: Improvement
Reporter: Ismael Juma


Once KAFKA-2983 is merged, we can simplify the MirrorMaker loop to be a single 
loop instead of two loops. In the old consumer, even if there was no message, 
offsets would still be committed, so receive() could block. The new consumer 
doesn't have this issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KAFKA-7048) NPE when creating connector

2018-06-15 Thread Rajini Sivaram (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513745#comment-16513745
 ] 

Rajini Sivaram edited comment on KAFKA-7048 at 6/15/18 2:12 PM:


[~rhauch] [~ryokota] Is this a blocker for 2.0.0? If so, can one of you review 
the PR?


was (Author: rsivaram):
[~rhauch] [~ryokota Is this a blocker for 2.0.0? If so, can one of you review 
the PR?

> NPE when creating connector
> ---
>
> Key: KAFKA-7048
> URL: https://issues.apache.org/jira/browse/KAFKA-7048
> Project: Kafka
>  Issue Type: Bug
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Blocker
> Fix For: 2.0.0
>
>
> KAFKA-6886 introduced the ConfigTransformer to transform the given 
> configuration data. ConfigTransformer#transform(Map) expects the passed 
> config not to be null, but DistributedHerder#putConnectorConfig calls 
> #transform before updating the snapshot (see below). Hence, it causes the 
> NPE. 
> {code:java}
> // Note that we use the updated connector config despite the fact that we 
> // don't have an updated snapshot yet. The existing task info should still 
> // be accurate.
> Map<String, String> map = configState.connectorConfig(connName);
> ConnectorInfo info = new ConnectorInfo(connName, config, 
> configState.tasks(connName),
> map == null ? null : 
> connectorTypeForClass(map.get(ConnectorConfig.CONNECTOR_CLASS_CONFIG)));
> callback.onCompletion(null, new Created<>(!exists, info));
> return null;{code}
> We can add a null check on "configs" (see below) to resolve the NPE. It means 
> we WON'T pass a null config to the configTransformer:
> {code:java}
> public Map<String, String> connectorConfig(String connector) {
>     Map<String, String> configs = connectorConfigs.get(connector);
>     if (configTransformer != null) { // add a condition "configs != null"
>         configs = configTransformer.transform(connector, configs);
>     }
>     return configs;
> }{code}
>  
>  
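The proposed guard can be shown with a self-contained sketch. Stand-in classes are used here instead of the real Connect `DistributedHerder`/`ConfigTransformer` (all names below are illustrative): transformation is simply skipped when no config is stored yet, so null never reaches the transformer.

```java
import java.util.HashMap;
import java.util.Map;

public class NullGuardSketch {
    // Stand-in for ConfigTransformer#transform; the real Connect class is not used.
    static Map<String, String> transform(String connector, Map<String, String> configs) {
        if (configs == null) throw new NullPointerException("configs");  // the original NPE
        return new HashMap<>(configs);
    }

    // Guarded lookup mirroring the proposed fix: skip transformation when the
    // connector has no stored config yet, instead of passing null through.
    static Map<String, String> connectorConfig(Map<String, Map<String, String>> store, String name) {
        Map<String, String> configs = store.get(name);
        if (configs != null) {           // the added "configs != null" condition
            configs = transform(name, configs);
        }
        return configs;                  // null means "no config yet"; callers handle it
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> store = new HashMap<>();
        // Unknown connector: returns null instead of throwing NPE.
        if (connectorConfig(store, "missing") != null) throw new AssertionError();
        Map<String, String> sinkCfg = new HashMap<>();
        sinkCfg.put("connector.class", "FileSink");
        store.put("sink", sinkCfg);
        // Known connector: config is transformed and returned as before.
        if (!"FileSink".equals(connectorConfig(store, "sink").get("connector.class")))
            throw new AssertionError();
        System.out.println("guard works");
    }
}
```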



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7055) Kafka Streams Processor API allows you to add sinks and processors without parent

2018-06-15 Thread Nikki Thean (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikki Thean updated KAFKA-7055:
---
Labels:   (was: easyfix)

> Kafka Streams Processor API allows you to add sinks and processors without 
> parent
> -
>
> Key: KAFKA-7055
> URL: https://issues.apache.org/jira/browse/KAFKA-7055
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 1.0.0
>Reporter: Nikki Thean
>Assignee: Nikki Thean
>Priority: Minor
> Fix For: 2.0.0
>
>
> The Kafka Streams Processor API allows you to define a Topology and connect 
> sources, processors, and sinks. From reading through the code, it seems that 
> you cannot forward a message to a downstream node unless it is explicitly 
> connected to the upstream node (from which you are forwarding the message) as 
> a child. Here is an example where you forward using the name of the 
> downstream node rather than the child index 
> ([https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/ProcessorContextImpl.java#L117]).
> However, I've been able to connect processors and sinks to the topology 
> without including parent names, i.e. with an empty vararg, using this method: 
> [https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/Topology.java#L423].
> As any attempt to forward a message to those nodes will throw a 
> StreamsException, I suggest throwing an exception if a processor or sink is 
> added without at least one upstream node. There is a method in 
> `InternalTopologyBuilder` that allows you to connect processors by name after 
> you add them to the topology, but it is not part of the external Processor 
> API.
> In addition (or alternatively), I suggest making [the error message for when 
> users try to forward messages to a node that is not 
> connected|https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/ProcessorContextImpl.java#L119]
>  more descriptive, like this one for when a user attempts to access a state 
> store that is not connected to the processor: 
> [https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/ProcessorContextImpl.java#L75-L81]
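The suggested eager validation can be sketched with a hypothetical builder standing in for the Streams `Topology` API (method and field names here are illustrative, not the real API): a parentless sink is rejected at add time instead of failing later with a StreamsException at forward() time.

```java
import java.util.HashSet;
import java.util.Set;

public class TopologyGuardSketch {
    // Minimal hypothetical builder state; not the Kafka Streams Topology class.
    static final Set<String> nodes = new HashSet<>();

    static void addSource(String name) { nodes.add(name); }

    // Proposed behavior: reject a sink with no parents when it is added,
    // rather than accepting it and throwing only when a record is forwarded.
    static void addSink(String name, String... parents) {
        if (parents.length == 0)
            throw new IllegalArgumentException("sink " + name + " needs at least one parent");
        for (String p : parents)
            if (!nodes.contains(p))
                throw new IllegalArgumentException("unknown parent " + p);
        nodes.add(name);
    }

    public static void main(String[] args) {
        nodes.clear();
        addSource("src");
        addSink("out", "src");          // fine: connected to an upstream node
        boolean rejected = false;
        try {
            addSink("orphan");          // no parents: rejected eagerly
        } catch (IllegalArgumentException e) {
            rejected = true;
        }
        if (!rejected) throw new AssertionError();
        System.out.println("orphan sink rejected");
    }
}
```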



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7048) NPE when creating connector

2018-06-15 Thread Rajini Sivaram (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513745#comment-16513745
 ] 

Rajini Sivaram commented on KAFKA-7048:
---

[~rhauch] [~ryokota] Is this a blocker for 2.0.0? If so, can one of you review 
the PR?

> NPE when creating connector
> ---
>
> Key: KAFKA-7048
> URL: https://issues.apache.org/jira/browse/KAFKA-7048
> Project: Kafka
>  Issue Type: Bug
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Blocker
> Fix For: 2.0.0
>
>
> KAFKA-6886 introduced the ConfigTransformer to transform the given 
> configuration data. ConfigTransformer#transform(Map) expects the passed 
> config not to be null, but DistributedHerder#putConnectorConfig calls 
> #transform before updating the snapshot (see below). Hence, it causes the 
> NPE. 
> {code:java}
> // Note that we use the updated connector config despite the fact that we 
> // don't have an updated snapshot yet. The existing task info should still 
> // be accurate.
> Map<String, String> map = configState.connectorConfig(connName);
> ConnectorInfo info = new ConnectorInfo(connName, config, 
> configState.tasks(connName),
> map == null ? null : 
> connectorTypeForClass(map.get(ConnectorConfig.CONNECTOR_CLASS_CONFIG)));
> callback.onCompletion(null, new Created<>(!exists, info));
> return null;{code}
> We can add a null check on "configs" (see below) to resolve the NPE. It means 
> we WON'T pass a null config to the configTransformer:
> {code:java}
> public Map<String, String> connectorConfig(String connector) {
>     Map<String, String> configs = connectorConfigs.get(connector);
>     if (configTransformer != null) { // add a condition "configs != null"
>         configs = configTransformer.transform(connector, configs);
>     }
>     return configs;
> }{code}
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7032) The TimeUnit is neglected by KakfaConsumer#close(long, TimeUnit)

2018-06-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513676#comment-16513676
 ] 

ASF GitHub Bot commented on KAFKA-7032:
---

rajinisivaram closed pull request #5182: KAFKA-7032 The TimeUnit is neglected 
by KakfaConsumer#close(long, Tim…
URL: https://github.com/apache/kafka/pull/5182
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git 
a/clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
b/clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java
index 76e0fcc9ba6..342c559c500 100644
--- a/clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java
+++ b/clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java
@@ -2081,7 +2081,7 @@ public void close() {
 @Deprecated
 @Override
 public void close(long timeout, TimeUnit timeUnit) {
-close(Duration.ofMillis(TimeUnit.MILLISECONDS.toMillis(timeout)));
+close(Duration.ofMillis(timeUnit.toMillis(timeout)));
 }
 
 /**
diff --git 
a/clients/src/test/java/org/apache/kafka/clients/consumer/KafkaConsumerTest.java
 
b/clients/src/test/java/org/apache/kafka/clients/consumer/KafkaConsumerTest.java
index 97ec08209aa..316404b39e2 100644
--- 
a/clients/src/test/java/org/apache/kafka/clients/consumer/KafkaConsumerTest.java
+++ 
b/clients/src/test/java/org/apache/kafka/clients/consumer/KafkaConsumerTest.java
@@ -44,10 +44,10 @@
 import org.apache.kafka.common.record.MemoryRecords;
 import org.apache.kafka.common.record.MemoryRecordsBuilder;
 import org.apache.kafka.common.record.TimestampType;
-import org.apache.kafka.common.requests.FetchResponse;
 import org.apache.kafka.common.requests.AbstractRequest;
 import org.apache.kafka.common.requests.AbstractResponse;
 import org.apache.kafka.common.requests.FetchRequest;
+import org.apache.kafka.common.requests.FetchResponse;
 import org.apache.kafka.common.requests.FindCoordinatorResponse;
 import org.apache.kafka.common.requests.HeartbeatResponse;
 import org.apache.kafka.common.requests.IsolationLevel;
@@ -71,6 +71,7 @@
 import org.apache.kafka.test.MockMetricsReporter;
 import org.apache.kafka.test.TestCondition;
 import org.apache.kafka.test.TestUtils;
+import org.easymock.EasyMock;
 import org.junit.Assert;
 import org.junit.Rule;
 import org.junit.Test;
@@ -1811,7 +1812,7 @@ private FetchResponse fetchResponse(TopicPartition 
partition, long fetchOffset,
 requestTimeoutMs,
 IsolationLevel.READ_UNCOMMITTED);
 
-return new KafkaConsumer<>(
+return new KafkaConsumer(
 loggerFactory,
 clientId,
 consumerCoordinator,
@@ -1839,4 +1840,15 @@ private FetchResponse fetchResponse(TopicPartition 
partition, long fetchOffset,
 this.count = count;
 }
 }
+
+@Test
+public void testCloseWithTimeUnit() {
+KafkaConsumer consumer = 
EasyMock.partialMockBuilder(KafkaConsumer.class)
+.addMockedMethod("close", Duration.class).createMock();
+consumer.close(Duration.ofSeconds(1));
+EasyMock.expectLastCall();
+EasyMock.replay(consumer);
+consumer.close(1, TimeUnit.SECONDS);
+EasyMock.verify(consumer);
+}
 }


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> The TimeUnit is neglected by KakfaConsumer#close(long, TimeUnit)
> 
>
> Key: KAFKA-7032
> URL: https://issues.apache.org/jira/browse/KAFKA-7032
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 2.0.0
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Blocker
> Fix For: 2.0.0
>
>
> {code:java}
> @Deprecated
> @Override
> public void close(long timeout, TimeUnit timeUnit) {
> close(Duration.ofMillis(TimeUnit.MILLISECONDS.toMillis(timeout)));
> }{code}
>  
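The one-character nature of the fix is easy to see in isolation: the buggy expression converts through `TimeUnit.MILLISECONDS` regardless of the unit the caller passed, so a call like `close(30, TimeUnit.SECONDS)` would be interpreted as a 30 ms timeout. A minimal sketch using only the plain JDK, no Kafka classes:

```java
import java.time.Duration;
import java.util.concurrent.TimeUnit;

public class TimeUnitBugDemo {
    public static void main(String[] args) {
        long timeout = 30;
        TimeUnit callerUnit = TimeUnit.SECONDS;  // the unit the caller actually passed
        // Buggy conversion: hard-codes MILLISECONDS, treating 30 seconds as 30 ms.
        Duration buggy = Duration.ofMillis(TimeUnit.MILLISECONDS.toMillis(timeout));
        // Fixed conversion: honors the caller's unit.
        Duration fixed = Duration.ofMillis(callerUnit.toMillis(timeout));
        if (buggy.toMillis() != 30) throw new AssertionError();
        if (fixed.toMillis() != 30_000) throw new AssertionError();
        System.out.println("buggy=" + buggy.toMillis() + "ms, fixed=" + fixed.toMillis() + "ms");
    }
}
```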



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7060) Command-line overrides for ConnectDistributed worker properties

2018-06-15 Thread Stanislav Kozlovski (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513581#comment-16513581
 ] 

Stanislav Kozlovski commented on KAFKA-7060:


Love this change. The only concern I have is that this might not be correctly 
classified as "Major" priority, nor is it necessarily a "Bug".

> Command-line overrides for ConnectDistributed worker properties
> ---
>
> Key: KAFKA-7060
> URL: https://issues.apache.org/jira/browse/KAFKA-7060
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Kevin Lafferty
>Priority: Major
>
> This Jira is for tracking the implementation of 
> [KIP-316|https://cwiki.apache.org/confluence/display/KAFKA/KIP-316%3A+Command-line+overrides+for+ConnectDistributed+worker+properties].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7061) Enhanced log compaction

2018-06-15 Thread Luis Cabral (JIRA)
Luis Cabral created KAFKA-7061:
--

 Summary: Enhanced log compaction
 Key: KAFKA-7061
 URL: https://issues.apache.org/jira/browse/KAFKA-7061
 Project: Kafka
  Issue Type: Improvement
  Components: core
Affects Versions: 2.0.0
Reporter: Luis Cabral
 Fix For: 2.1.0


Enhance log compaction to support more than just offset comparison, so that 
insertion order doesn't dictate which records to keep.

Default behavior is kept as it was, with the enhanced approach having to be 
purposely activated.
The enhanced compaction is done either via the record timestamp, by setting 
the new configuration to "timestamp", or via the record headers, by setting 
this configuration to anything other than the default "offset" or the 
reserved "timestamp".

See 
[KIP-280|https://cwiki.apache.org/confluence/display/KAFKA/KIP-280%3A+Enhanced+log+compaction]
 for more details.
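The record-selection rule the description implies can be sketched as follows, under the assumption (from KIP-280) that compaction picks one surviving record per key. The class and method names below are hypothetical and not Kafka's internals:

```java
public class CompactionSketch {
    // Hypothetical record holder; fields are illustrative, not Kafka's internals.
    static class Rec {
        final long offset;
        final long timestamp;
        final String value;
        Rec(long offset, long timestamp, String value) {
            this.offset = offset;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Offset-based compaction (the default): the highest offset wins,
    // i.e. insertion order decides.
    static Rec keepByOffset(Rec a, Rec b) {
        return a.offset >= b.offset ? a : b;
    }

    // Timestamp-based compaction ("timestamp" strategy, sketched): the highest
    // timestamp wins, falling back to offset on ties so the result is deterministic.
    static Rec keepByTimestamp(Rec a, Rec b) {
        if (a.timestamp != b.timestamp) return a.timestamp > b.timestamp ? a : b;
        return keepByOffset(a, b);
    }

    public static void main(String[] args) {
        // A late-arriving record (higher offset) carrying an older timestamp:
        Rec late = new Rec(11, 1000, "written later, semantically older");
        Rec early = new Rec(10, 2000, "written earlier, semantically newer");
        if (keepByOffset(late, early) != late) throw new AssertionError();
        if (keepByTimestamp(late, early) != early) throw new AssertionError();
        System.out.println("offset strategy keeps: " + keepByOffset(late, early).value);
        System.out.println("timestamp strategy keeps: " + keepByTimestamp(late, early).value);
    }
}
```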



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-5975) No response when deleting topics and delete.topic.enable=false

2018-06-15 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-5975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-5975:
---
Fix Version/s: (was: 1.0.2)

> No response when deleting topics and delete.topic.enable=false
> --
>
> Key: KAFKA-5975
> URL: https://issues.apache.org/jira/browse/KAFKA-5975
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.11.0.0
>Reporter: Mario Molina
>Assignee: Mario Molina
>Priority: Minor
> Fix For: 1.1.1
>
>
> When trying to delete topics using the KafkaAdminClient while the server 
> config has 'delete.topic.enable=false', the client cannot get a response and 
> fails with a timeout error. This is because the DelayedCreatePartitions 
> object cannot complete the operation.
> This bug fix modifies the handling of the DELETE_TOPICS KafkaApis key to take 
> into account that the flag can be disabled and to swallow the error; that is, 
> the topic is never removed and no error is returned to the client.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-4950) ConcurrentModificationException when iterating over Kafka Metrics

2018-06-15 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-4950:
---
Fix Version/s: (was: 1.0.2)

> ConcurrentModificationException when iterating over Kafka Metrics
> -
>
> Key: KAFKA-4950
> URL: https://issues.apache.org/jira/browse/KAFKA-4950
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.1.1
>Reporter: Dumitru Postoronca
>Assignee: Sébastien Launay
>Priority: Minor
> Fix For: 2.1.0
>
>
> It looks like the when calling {{PartitionStates.partitionSet()}}, while the 
> resulting Hashmap is being built, the internal state of the allocations can 
> change, which leads to ConcurrentModificationException during the copy 
> operation.
> {code}
> java.util.ConcurrentModificationException
> at 
> java.util.LinkedHashMap$LinkedHashIterator.nextNode(LinkedHashMap.java:719)
> at 
> java.util.LinkedHashMap$LinkedKeyIterator.next(LinkedHashMap.java:742)
> at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
> at java.util.HashSet.(HashSet.java:119)
> at 
> org.apache.kafka.common.internals.PartitionStates.partitionSet(PartitionStates.java:66)
> at 
> org.apache.kafka.clients.consumer.internals.SubscriptionState.assignedPartitions(SubscriptionState.java:291)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$ConsumerCoordinatorMetrics$1.measure(ConsumerCoordinator.java:783)
> at 
> org.apache.kafka.common.metrics.KafkaMetric.value(KafkaMetric.java:61)
> at 
> org.apache.kafka.common.metrics.KafkaMetric.value(KafkaMetric.java:52)
> {code}
> {code}
> // client code:
> import java.util.Collections;
> import java.util.HashMap;
> import java.util.Map;
> import com.codahale.metrics.Gauge;
> import com.codahale.metrics.Metric;
> import com.codahale.metrics.MetricSet;
> import org.apache.kafka.clients.consumer.KafkaConsumer;
> import org.apache.kafka.common.MetricName;
> import static com.codahale.metrics.MetricRegistry.name;
> public class KafkaMetricSet implements MetricSet {
>     private final KafkaConsumer client;
>     public KafkaMetricSet(KafkaConsumer client) {
>         this.client = client;
>     }
>     @Override
>     public Map<String, Metric> getMetrics() {
>         final Map<String, Metric> gauges = new HashMap<String, Metric>();
>         Map<MetricName, ? extends org.apache.kafka.common.Metric> m = client.metrics();
>         for (Map.Entry<MetricName, ? extends org.apache.kafka.common.Metric> e : m.entrySet()) {
>             gauges.put(name(e.getKey().group(), e.getKey().name(), "count"), new Gauge<Double>() {
>                 @Override
>                 public Double getValue() {
>                     return e.getValue().value(); // exception thrown here
>                 }
>             });
>         }
>         return Collections.unmodifiableMap(gauges);
>     }
> }
> {code}
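The fail-fast iterator behavior behind this stack trace can be reproduced deterministically even in a single thread (the original failure is cross-thread, but the exception source, the map's modCount check during iteration, is the same). A minimal sketch:

```java
import java.util.ConcurrentModificationException;
import java.util.LinkedHashMap;
import java.util.Map;

public class CmeDemo {
    public static void main(String[] args) {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put("a", 1);
        m.put("b", 2);
        boolean caught = false;
        try {
            // Iterating while structurally modifying the same map: the iterator's
            // modCount check throws on the next next() call, just as in the trace.
            for (String k : m.keySet()) {
                m.put("c", 3);
            }
        } catch (ConcurrentModificationException e) {
            caught = true;
        }
        if (!caught) throw new AssertionError("expected ConcurrentModificationException");
        System.out.println("CME thrown as expected");
    }
}
```

One common remedy for this class of bug is to take the snapshot under the same synchronization that guards the mutations, or to use a concurrent collection, rather than copying a live `LinkedHashMap`.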



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-5973) ShutdownableThread catching errors can lead to partial hard to diagnose broker failure

2018-06-15 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-5973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-5973:
---
Fix Version/s: (was: 1.0.2)

> ShutdownableThread catching errors can lead to partial hard to diagnose 
> broker failure
> --
>
> Key: KAFKA-5973
> URL: https://issues.apache.org/jira/browse/KAFKA-5973
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.11.0.0, 0.11.0.1
>Reporter: Tom Crayford
>Priority: Major
> Attachments: 5973.v1.txt
>
>
> When any Kafka broker {{ShutdownableThread}} subclass crashes due to an
> uncaught exception, the broker is left running in a very weird/bad state: 
> some threads are not running, yet the broker can potentially still be serving 
> traffic to users while not performing its usual operations.
> This is problematic, because monitoring may say that "the broker is up and 
> fine", but in fact it is not healthy.
> At Heroku we've been mitigating this by monitoring all threads that "should" 
> be
> running on a broker and alerting when a given thread isn't running for some
> reason.
> Things that use {{ShutdownableThread}} that can crash and leave a broker/the 
> controller in a bad state:
> - log cleaner
> - replica fetcher threads
> - controller to broker send threads
> - controller topic deletion threads
> - quota throttling reapers
> - io threads
> - network threads
> - group metadata management threads
> Some of these can have disastrous consequences, and nearly all of them 
> crashing for any reason is a cause for alert.
> But, users probably shouldn't have to know about all the internals of Kafka 
> and run thread dumps periodically as part of normal operations.
> There are a few potential options here:
> 1. On the crash of any {{ShutdownableThread}}, shutdown the whole broker 
> process
> We could crash the whole broker when an individual thread dies. I think this 
> is pretty reasonable, it's better to have a very visible breakage than a very 
> hard to detect one.
> 2. Add some healthcheck JMX bean to detect these thread crashes
> Users having to audit all of Kafka's source code on each new release and 
> track a list of "threads that should be running" is... pretty silly. We could 
> instead expose a JMX bean of some kind indicating threads that died due to 
> uncaught exceptions
> 3. Do nothing, but add documentation around monitoring/logging that exposes 
> this error
> These thread deaths *do* emit log lines, but it's not that clear or obvious 
> to users they need to monitor and alert on them. The project could add 
> documentation
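Detection of such thread deaths can be built on the JDK's uncaught-exception hook. A minimal sketch of the mechanism (the thread name and the escalation policy are illustrative; Kafka's actual {{ShutdownableThread}} handles exceptions in its own run loop):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

public class UncaughtHandlerSketch {
    public static void main(String[] args) {
        AtomicReference<Throwable> died = new AtomicReference<>();
        CountDownLatch latch = new CountDownLatch(1);
        // A worker that dies with an uncaught exception, like a crashed fetcher thread.
        Thread worker = new Thread(() -> {
            throw new IllegalStateException("boom");
        }, "replica-fetcher-0");
        // A process-wide policy hook: here it just records the failure, but it is
        // where "option 1" (halt the whole process) or "option 2" (flip a health
        // metric) from the ticket would plug in.
        worker.setUncaughtExceptionHandler((t, e) -> {
            died.set(e);
            latch.countDown();
        });
        worker.start();
        try {
            latch.await();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        if (!(died.get() instanceof IllegalStateException)) throw new AssertionError();
        System.out.println("would escalate: " + died.get().getMessage());
    }
}
```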



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-5445) Document exceptions thrown by AdminClient methods

2018-06-15 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-5445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-5445:
---
Fix Version/s: (was: 1.0.2)

> Document exceptions thrown by AdminClient methods
> -
>
> Key: KAFKA-5445
> URL: https://issues.apache.org/jira/browse/KAFKA-5445
> Project: Kafka
>  Issue Type: Improvement
>  Components: admin, clients
>Reporter: Ismael Juma
>Assignee: Andrey Dyachkov
>Priority: Major
> Fix For: 2.1.0
>
>
> AdminClient should document the exceptions that users may have to handle.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-4686) Null Message payload is shutting down broker

2018-06-15 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-4686:
---
Fix Version/s: (was: 1.0.2)

> Null Message payload is shutting down broker
> 
>
> Key: KAFKA-4686
> URL: https://issues.apache.org/jira/browse/KAFKA-4686
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.1.1
> Environment: Amazon Linux AMI release 2016.03 kernel 
> 4.4.19-29.55.amzn1.x86_64
>Reporter: Rodrigo Queiroz Saramago
>Priority: Critical
> Attachments: KAFKA-4686-NullMessagePayloadError.tar.xz, 
> kafkaServer.out
>
>
> Hello, I have a test environment with 3 brokers and 1 ZooKeeper node, in 
> which clients connect using two-way SSL authentication. I use Kafka version 
> 0.10.1.1. The system works as expected for a while, but if a broker goes 
> down and is then restarted, something gets corrupted and it is not possible 
> to start the broker again; it always fails with the same error. What does 
> this error mean? What can I do in this case? Is this the expected behavior?
> [2017-01-23 07:03:28,927] ERROR There was an error in one of the threads 
> during logs loading: kafka.common.KafkaException: Message payload is null: 
> Message(magic = 0, attributes = 1, crc = 4122289508, key = null, payload = 
> null) (kafka.log.LogManager)
> [2017-01-23 07:03:28,929] FATAL Fatal error during KafkaServer startup. 
> Prepare to shutdown (kafka.server.KafkaServer)
> kafka.common.KafkaException: Message payload is null: Message(magic = 0, 
> attributes = 1, crc = 4122289508, key = null, payload = null)
> at 
> kafka.message.ByteBufferMessageSet$$anon$1.<init>(ByteBufferMessageSet.scala:90)
> at 
> kafka.message.ByteBufferMessageSet$.deepIterator(ByteBufferMessageSet.scala:85)
> at kafka.message.MessageAndOffset.firstOffset(MessageAndOffset.scala:33)
> at kafka.log.LogSegment.recover(LogSegment.scala:223)
> at kafka.log.Log$$anonfun$loadSegments$4.apply(Log.scala:218)
> at kafka.log.Log$$anonfun$loadSegments$4.apply(Log.scala:179)
> at 
> scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
> at 
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
> at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
> at 
> scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
> at kafka.log.Log.loadSegments(Log.scala:179)
> at kafka.log.Log.<init>(Log.scala:108)
> at 
> kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$3$$anonfun$apply$10$$anonfun$apply$1.apply$mcV$sp(LogManager.scala:151)
> at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:58)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> [2017-01-23 07:03:28,946] INFO shutting down (kafka.server.KafkaServer)
> [2017-01-23 07:03:28,949] INFO Terminate ZkClient event thread. 
> (org.I0Itec.zkclient.ZkEventThread)
> [2017-01-23 07:03:28,954] INFO EventThread shut down for session: 
> 0x159bd458ae70008 (org.apache.zookeeper.ClientCnxn)
> [2017-01-23 07:03:28,954] INFO Session: 0x159bd458ae70008 closed 
> (org.apache.zookeeper.ZooKeeper)
> [2017-01-23 07:03:28,957] INFO shut down completed (kafka.server.KafkaServer)
> [2017-01-23 07:03:28,959] FATAL Fatal error during KafkaServerStartable 
> startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
> kafka.common.KafkaException: Message payload is null: Message(magic = 0, 
> attributes = 1, crc = 4122289508, key = null, payload = null)
> at 
> kafka.message.ByteBufferMessageSet$$anon$1.<init>(ByteBufferMessageSet.scala:90)
> at 
> kafka.message.ByteBufferMessageSet$.deepIterator(ByteBufferMessageSet.scala:85)
> at kafka.message.MessageAndOffset.firstOffset(MessageAndOffset.scala:33)
> at kafka.log.LogSegment.recover(LogSegment.scala:223)
> at kafka.log.Log$$anonfun$loadSegments$4.apply(Log.scala:218)
> at kafka.log.Log$$anonfun$loadSegments$4.apply(Log.scala:179)
> at 
> scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
> at 
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
> at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
> at 
> scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
> at kafka.log.Log.loadSegments(Log.scala:179)
> at kafka.log.Log.<init>(Log.scala:108)
> at 
> 

[jira] [Updated] (KAFKA-6083) The Fetcher should add the InvalidRecordException as a cause to the KafkaException when invalid record is found.

2018-06-15 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-6083:
---
Fix Version/s: (was: 1.0.2)

> The Fetcher should add the InvalidRecordException as a cause to the 
> KafkaException when invalid record is found.
> 
>
> Key: KAFKA-6083
> URL: https://issues.apache.org/jira/browse/KAFKA-6083
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer
>Affects Versions: 1.0.0
>Reporter: Jiangjie Qin
>Assignee: Adem Efe Gencer
>Priority: Major
>  Labels: newbie++
>
> In the Fetcher, when an InvalidRecordException is thrown, we convert it to a 
> KafkaException; we should also attach the InvalidRecordException to it as the 
> cause.
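The change the description asks for is standard Java cause chaining. A minimal sketch follows; the two exception classes here are illustrative stand-ins, not Kafka's actual types, and the message text is invented for the example:

```java
// Sketch of preserving the original exception as the cause when wrapping it,
// so callers can still inspect the root failure via getCause().
public class CauseChainingSketch {
    // Stand-in for Kafka's InvalidRecordException (hypothetical).
    static class InvalidRecordException extends RuntimeException {
        InvalidRecordException(String msg) { super(msg); }
    }

    // Stand-in for Kafka's KafkaException (hypothetical).
    static class KafkaException extends RuntimeException {
        KafkaException(String msg, Throwable cause) { super(msg, cause); }
    }

    static void parseRecord() {
        throw new InvalidRecordException("Record is corrupt (CRC mismatch)");
    }

    public static void main(String[] args) {
        try {
            parseRecord();
        } catch (InvalidRecordException e) {
            // Wrapping without passing `e` would lose the root failure;
            // passing it as the constructor's cause keeps the chain intact.
            KafkaException wrapped = new KafkaException(
                "Received invalid record: " + e.getMessage(), e);
            System.out.println(wrapped.getCause().getClass().getSimpleName());
        }
    }
}
```

With the cause attached, stack traces print a "Caused by:" section and callers can walk the chain programmatically.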



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-4972) Kafka 0.10.0 Found a corrupted index file during Kafka broker startup

2018-06-15 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-4972:
---
Fix Version/s: (was: 1.0.2)

> Kafka 0.10.0  Found a corrupted index file during Kafka broker startup
> --
>
> Key: KAFKA-4972
> URL: https://issues.apache.org/jira/browse/KAFKA-4972
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.10.0.0
> Environment: JDK: HotSpot  x64  1.7.0_80
> Tag: 0.10.0
>Reporter: fangjinuo
>Priority: Critical
>  Labels: reliability
> Attachments: Snap3.png
>
>
> After force-shutting down all Kafka brokers one by one and restarting them 
> one by one, one broker failed to start up.
> The following WARN-level log was found in the log file:
> found a corrupted index file,  .index , delete it  ...
> You can view the details in the attachment.
> I looked through some code in the core module and found that the 
> non-thread-safe method LogSegment.append(offset, messages) has two callers:
> 1) Log.append(messages)  // holds a synchronized 
> lock 
> 2) LogCleaner.cleanInto(topicAndPartition, source, dest, map, retainDeletes, 
> messageFormatVersion)   // does not 
> So I guess this may be the reason for the repeated offsets in the 0xx.log 
> file (the log segment's .log).
> Although this is just my inference, I hope this problem can be fixed 
> quickly.
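The hazard the reporter describes, a non-thread-safe append reachable from two call paths where only one holds a lock, can be sketched as below. Class and method names are illustrative stand-ins for LogSegment/Log/LogCleaner, not Kafka's actual code, and the fix shown (taking the same lock on both paths) is the remedy the description implies, not a confirmed patch:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a non-thread-safe append guarded by a shared lock on both of
// its call paths. If either caller skipped the lock, concurrent appends
// could interleave and produce duplicate or out-of-order offsets, the
// kind of corruption the index rebuild then trips over.
public class SegmentSketch {
    private final List<Long> offsets = new ArrayList<>();
    private final Object lock = new Object();

    // Not thread-safe on its own (stand-in for LogSegment.append).
    void append(long offset) {
        offsets.add(offset);
    }

    // Caller 1: synchronized (stand-in for Log.append).
    void appendLocked(long offset) {
        synchronized (lock) {
            append(offset);
        }
    }

    // Caller 2: the description says the cleaner path lacked this lock;
    // taking the same monitor removes the race (stand-in for cleanInto).
    void cleanInto(long offset) {
        synchronized (lock) {
            append(offset);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SegmentSketch s = new SegmentSketch();
        Thread writer = new Thread(() -> {
            for (long i = 0; i < 1000; i++) s.appendLocked(i);
        });
        Thread cleaner = new Thread(() -> {
            for (long i = 1000; i < 2000; i++) s.cleanInto(i);
        });
        writer.start(); cleaner.start();
        writer.join(); cleaner.join();
        // With both paths locked, no appends are lost or interleaved
        // mid-operation: all 2000 land in the list.
        System.out.println(s.offsets.size());
    }
}
```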



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-5474) Streams StandbyTask should not checkpoint on commit if EOS is enabled

2018-06-15 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-5474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-5474:
---
Affects Version/s: (was: 1.0.0)
   0.11.0.0

> Streams StandbyTask should not checkpoint on commit if EOS is enabled
> 
>
> Key: KAFKA-5474
> URL: https://issues.apache.org/jira/browse/KAFKA-5474
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.0.1, 1.0.0
>
>
> Discovered by system test {{streams_eos_test#test_failure_and_recovery}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-5474) Streams StandbyTask should not checkpoint on commit if EOS is enabled

2018-06-15 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-5474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-5474:
---
Fix Version/s: 0.11.0.1
   1.0.0

> Streams StandbyTask should not checkpoint on commit if EOS is enabled
> 
>
> Key: KAFKA-5474
> URL: https://issues.apache.org/jira/browse/KAFKA-5474
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.0.1, 1.0.0
>
>
> Discovered by system test {{streams_eos_test#test_failure_and_recovery}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)