[jira] [Commented] (KAFKA-5153) KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting

2017-06-26 Thread Mahesh Veerabathiran (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064176#comment-16064176
 ] 

Mahesh Veerabathiran commented on KAFKA-5153:
-

Having the same issue. Does anyone have a resolution?

> KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting
> ---
>
> Key: KAFKA-5153
> URL: https://issues.apache.org/jira/browse/KAFKA-5153
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.0
> Environment: RHEL 6
> Java Version  1.8.0_91-b14
>Reporter: Arpan
>Priority: Critical
> Attachments: server_1_72server.log, server_2_73_server.log, 
> server_3_74Server.log, server.properties, ThreadDump_1493564142.dump, 
> ThreadDump_1493564177.dump, ThreadDump_1493564249.dump
>
>
> Hi Team, 
> I was earlier referring to issue KAFKA-4477 because the problem I am facing 
> is similar. I tried to find the same reference in the release docs as well 
> but did not see anything in 0.10.1.1 or 0.10.2.0. I am currently using 
> 2.11_0.10.2.0.
> I have a 3-node Kafka cluster, with a ZooKeeper cluster running on the same 
> set of servers in cluster mode. Around 240 GB of data is transferred through 
> Kafka every day. What we observe is that a server disconnects from the 
> cluster, the ISR shrinks, and the service starts being impacted.
> I have also observed the file descriptor count increasing a bit: under normal 
> circumstances we have not observed an FD count above 500, but when the issue 
> started we observed it in the range of 650-700 on all 3 servers. 
> I am attaching thread dumps of all 3 servers taken when we recently started 
> facing the issue.
> The issue vanishes once the nodes are bounced, and the setup does not run for 
> more than 5 days without the issue recurring. Attaching server logs as well.
> Kindly let me know if you need any additional information. I am attaching 
> server.properties as well for one of the servers (it is similar on all 3 
> servers).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5522) ListOffset should take LSO into account when searching by timestamp

2017-06-26 Thread Jason Gustafson (JIRA)
Jason Gustafson created KAFKA-5522:
--

 Summary: ListOffset should take LSO into account when searching by 
timestamp
 Key: KAFKA-5522
 URL: https://issues.apache.org/jira/browse/KAFKA-5522
 Project: Kafka
  Issue Type: Sub-task
Affects Versions: 0.11.0.0
Reporter: Jason Gustafson
Assignee: Jason Gustafson
 Fix For: 0.11.0.1


For a normal read_uncommitted consumer, we bound the offset returned from 
ListOffsets by the high watermark. For read_committed consumers, we should 
similarly bound offsets by the LSO. Currently we only handle the case of 
fetching the end offset.
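
For context, a minimal consumer-side sketch of the API this affects, assuming a local broker, a placeholder topic, and an illustrative group id: for a consumer configured with isolation.level=read_committed, the offset returned by the timestamp search should be capped at the LSO rather than the high watermark.

{noformat}
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class ListOffsetsLsoExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // placeholder broker
        props.put("group.id", "lso-example");                // illustrative group id
        // A read_committed consumer should only ever see offsets up to the LSO.
        props.put("isolation.level", "read_committed");
        props.put("key.deserializer", ByteArrayDeserializer.class.getName());
        props.put("value.deserializer", ByteArrayDeserializer.class.getName());

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("my-topic", 0); // placeholder topic
            long searchTimestamp = System.currentTimeMillis() - 60_000L;
            // Issues a ListOffsets request under the hood; per this ticket the broker
            // should bound the answer by the LSO for read_committed consumers, just as
            // it bounds it by the high watermark for read_uncommitted consumers.
            Map<TopicPartition, OffsetAndTimestamp> result =
                consumer.offsetsForTimes(Collections.singletonMap(tp, searchTimestamp));
            OffsetAndTimestamp found = result.get(tp);
            if (found != null) {
                System.out.printf("offset=%d timestamp=%d%n", found.offset(), found.timestamp());
            }
        }
    }
}
{noformat}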



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5521) Support replicas movement between log directories (KIP-113)

2017-06-26 Thread Dong Lin (JIRA)
Dong Lin created KAFKA-5521:
---

 Summary: Support replicas movement between log directories 
(KIP-113)
 Key: KAFKA-5521
 URL: https://issues.apache.org/jira/browse/KAFKA-5521
 Project: Kafka
  Issue Type: Bug
Reporter: Dong Lin
Assignee: Dong Lin






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-5367) Producer should not expire topic from metadata cache if accumulator still has data for this topic

2017-06-26 Thread Dong Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Lin resolved KAFKA-5367.
-
Resolution: Invalid

> Producer should not expire topic from metadata cache if accumulator still has 
> data for this topic
> -
>
> Key: KAFKA-5367
> URL: https://issues.apache.org/jira/browse/KAFKA-5367
> Project: Kafka
>  Issue Type: Bug
>Reporter: Dong Lin
>Assignee: Dong Lin
>
> To be added.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-4750) KeyValueIterator returns null values

2017-06-26 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063956#comment-16063956
 ] 

Matthias J. Sax commented on KAFKA-4750:


I think [~damianguy] or [~enothereska] can comment best on this. AFAIK, we use 
{{put(key,null)}} with delete semantics all over the place, also for {{KTable}} 
caches. As it aligns with changelog delete semantics, I also think it makes 
sense to keep it this way. I would rather educate users that the Serde they 
plug in must not return {{null}} if the input is not {{null}}. We can also add 
checks to all {{Serde}} calls: (1) never call the Serde for {{null}}, as we 
know the result must be {{null}} anyway; (2) if we call the Serde with a 
non-null value, make sure it does not return {{null}} -- otherwise throw an 
exception.
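
As a rough illustration of checks (1) and (2), here is a hypothetical delegating serializer (not part of Kafka) that enforces both; the actual Streams-internal plumbing would look different, but the idea is the same:

{noformat}
import java.util.Map;
import org.apache.kafka.common.serialization.Serializer;

/**
 * Hypothetical wrapper (not part of Kafka) illustrating the two checks above:
 * (1) skip the delegate entirely for null input, and (2) fail fast if the
 * delegate maps a non-null value to null, which would break delete semantics.
 */
public class NullCheckingSerializer<T> implements Serializer<T> {
    private final Serializer<T> delegate;

    public NullCheckingSerializer(Serializer<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
        delegate.configure(configs, isKey);
    }

    @Override
    public byte[] serialize(String topic, T data) {
        if (data == null) {
            return null; // (1) never call the delegate for null input
        }
        byte[] bytes = delegate.serialize(topic, data);
        if (bytes == null) {
            // (2) a non-null value must not serialize to null
            throw new IllegalStateException(
                "Serializer returned null for a non-null value on topic " + topic);
        }
        return bytes;
    }

    @Override
    public void close() {
        delegate.close();
    }
}
{noformat}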

> KeyValueIterator returns null values
> 
>
> Key: KAFKA-4750
> URL: https://issues.apache.org/jira/browse/KAFKA-4750
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.1.1, 0.11.0.0, 0.10.2.1
>Reporter: Michal Borowiecki
>Assignee: Evgeny Veretennikov
>  Labels: newbie
> Attachments: DeleteTest.java
>
>
> The API of the ReadOnlyKeyValueStore.range method promises that the returned 
> iterator will not return null values. However, after upgrading from 0.10.0.0 
> to 0.10.1.1 we found that null values are returned, causing NPEs on our side.
> I found this happens after removing entries from the store, and it resembles 
> the SAMZA-94 defect. The problem seems to be the same as it was there: when 
> deleting entries with a serializer that does not return null when null is 
> passed in, the state store doesn't actually delete that key/value pair, but 
> the iterator will return a null value for that key.
> When I modified our serializer to return null when null is passed in, the 
> problem went away. However, I believe this should be fixed in Kafka Streams, 
> perhaps with a similar approach to SAMZA-94.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-5464) StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG

2017-06-26 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-5464:
---
Fix Version/s: 0.11.1.0

> StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG
> --
>
> Key: KAFKA-5464
> URL: https://issues.apache.org/jira/browse/KAFKA-5464
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.10.2.1
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
> Fix For: 0.11.1.0, 0.10.2.2, 0.11.0.1
>
>
> In {{StreamsKafkaClient}} we use a {{NetworkClient}} internally and call 
> {{poll}} using {{StreamsConfig.POLL_MS_CONFIG}} as the timeout.
> However, {{StreamsConfig.POLL_MS_CONFIG}} is solely meant to be applied to 
> {{KafkaConsumer.poll()}}, and it's incorrect to use it for the 
> {{NetworkClient}}. If the config is increased, this can lead to an infinite 
> rebalance: the rebalance time on the client side increases and thus the 
> client is no longer able to meet broker-enforced timeouts.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-5464) StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG

2017-06-26 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-5464:
---
Fix Version/s: 0.11.0.1
   0.10.2.2

> StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG
> --
>
> Key: KAFKA-5464
> URL: https://issues.apache.org/jira/browse/KAFKA-5464
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.10.2.1
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
> Fix For: 0.10.2.2, 0.11.0.1
>
>
> In {{StreamsKafkaClient}} we use a {{NetworkClient}} internally and call 
> {{poll}} using {{StreamsConfig.POLL_MS_CONFIG}} as the timeout.
> However, {{StreamsConfig.POLL_MS_CONFIG}} is solely meant to be applied to 
> {{KafkaConsumer.poll()}}, and it's incorrect to use it for the 
> {{NetworkClient}}. If the config is increased, this can lead to an infinite 
> rebalance: the rebalance time on the client side increases and thus the 
> client is no longer able to meet broker-enforced timeouts.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5464) StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG

2017-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063937#comment-16063937
 ] 

ASF GitHub Bot commented on KAFKA-5464:
---

GitHub user mjsax opened a pull request:

https://github.com/apache/kafka/pull/3439

KAFKA-5464: StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mjsax/kafka kafka-5464-streamskafkaclient-poll

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3439.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3439


commit 609ceeb0d546f02b9da2638545784f050dc9558a
Author: Matthias J. Sax 
Date:   2017-06-26T22:52:56Z

KAFKA-5464: StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG




> StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG
> --
>
> Key: KAFKA-5464
> URL: https://issues.apache.org/jira/browse/KAFKA-5464
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.10.2.1
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>
> In {{StreamsKafkaClient}} we use a {{NetworkClient}} internally and call 
> {{poll}} using {{StreamsConfig.POLL_MS_CONFIG}} as the timeout.
> However, {{StreamsConfig.POLL_MS_CONFIG}} is solely meant to be applied to 
> {{KafkaConsumer.poll()}}, and it's incorrect to use it for the 
> {{NetworkClient}}. If the config is increased, this can lead to an infinite 
> rebalance: the rebalance time on the client side increases and thus the 
> client is no longer able to meet broker-enforced timeouts.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-3806) Adjust default values of log.retention.hours and offsets.retention.minutes

2017-06-26 Thread James Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063798#comment-16063798
 ] 

James Cheng commented on KAFKA-3806:


[~junrao], your example of "track when an offset is obsolete based on the 
activity of the group" is being tracked in 
https://issues.apache.org/jira/browse/KAFKA-4682

> Adjust default values of log.retention.hours and offsets.retention.minutes
> --
>
> Key: KAFKA-3806
> URL: https://issues.apache.org/jira/browse/KAFKA-3806
> Project: Kafka
>  Issue Type: Improvement
>  Components: config
>Affects Versions: 0.9.0.1, 0.10.0.0
>Reporter: Michal Turek
>Priority: Minor
>
> Combination of default values of log.retention.hours (168 hours = 7 days) and 
> offsets.retention.minutes (1440 minutes = 1 day) may be dangerous in special 
> cases. Offset retention should be always greater than log retention.
> We have observed the following scenario and issue:
> - Producing of data to a topic was disabled two days ago by producer update, 
> topic wasn't deleted.
> - Consumer consumed all data and properly committed offsets to Kafka.
> - Consumer made no more offset commits for that topic because there was no 
> more incoming data and there was nothing to confirm. (We have auto-commit 
> disabled; I'm not sure how it behaves with auto-commit enabled.)
> - After one day: Kafka cleared too old offsets according to 
> offsets.retention.minutes.
> - After two days: Long-term running consumer was restarted after update, it 
> didn't find any committed offsets for that topic since they were deleted by 
> offsets.retention.minutes so it started consuming from the beginning.
> - The messages were still in Kafka due to larger log.retention.hours, about 5 
> days of messages were read again.
> Known workaround to solve this issue:
> - Explicitly configure log.retention.hours and offsets.retention.minutes, 
> don't use defaults.
> Proposals:
> - Increase the default value of offsets.retention.minutes to be at least 
> twice log.retention.hours.
> - Check these values during Kafka startup and log a warning if 
> offsets.retention.minutes is smaller than log.retention.hours.
> - Add a note to migration guide about differences between storing of offsets 
> in ZooKeeper and Kafka (http://kafka.apache.org/documentation.html#upgrade).
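
A minimal sketch of the workaround described above, with illustrative values only (choose retention periods that fit your own deployment); the point is simply that offsets.retention.minutes should outlive log.retention.hours:

{noformat}
# server.properties -- illustrative values only, not recommendations
# Keep log data for 7 days.
log.retention.hours=168
# Keep committed offsets for 14 days, i.e. longer than the log retention.
offsets.retention.minutes=20160
{noformat}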



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-4849) Bug in KafkaStreams documentation

2017-06-26 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063739#comment-16063739
 ] 

Guozhang Wang commented on KAFKA-4849:
--

We have resolved all the reported issues in this JIRA except the one in the web 
docs, which is already covered in KAFKA-4705. Resolving this ticket for now.

> Bug in KafkaStreams documentation
> -
>
> Key: KAFKA-4849
> URL: https://issues.apache.org/jira/browse/KAFKA-4849
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Seweryn Habdank-Wojewodzki
>Assignee: Matthias J. Sax
>Priority: Minor
>
> At the page: https://kafka.apache.org/documentation/streams
>  
> In the chapter titled Application Configuration and Execution, in the example 
> there is a line:
>  
> settings.put(StreamsConfig.ZOOKEEPER_CONNECT_CONFIG, "zookeeper1:2181");
>  
> but ZOOKEEPER_CONNECT_CONFIG is deprecated in Kafka version 0.10.2.0.
>  
> Also, the table on the page 
> https://kafka.apache.org/0102/documentation/#streamsconfigs is a bit 
> misleading:
> 1. Again, zookeeper.connect is deprecated.
> 2. client.id and zookeeper.connect are marked as high importance, 
> but according to http://docs.confluent.io/3.2.0/streams/developer-guide.html 
> neither of them is required to initialize the stream.
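
For reference, a minimal sketch of such a configuration without the deprecated setting; the application id and broker address below are placeholders:

{noformat}
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class StreamsConfigExample {
    public static void main(String[] args) {
        Properties settings = new Properties();
        // Placeholders: use your own application id and broker list.
        settings.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
        settings.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka-broker1:9092");
        // Note: no StreamsConfig.ZOOKEEPER_CONNECT_CONFIG -- it is deprecated in
        // 0.10.2.0 and not required to initialize the streams application.
        StreamsConfig config = new StreamsConfig(settings);
        System.out.println(config.getList(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG));
    }
}
{noformat}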



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-3465) kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode

2017-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063725#comment-16063725
 ] 

ASF GitHub Bot commented on KAFKA-3465:
---

GitHub user vahidhashemian opened a pull request:

https://github.com/apache/kafka/pull/3438

KAFKA-3465: Clarify warning message of ConsumerOffsetChecker

Add that the tool works with the old consumer only.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vahidhashemian/kafka KAFKA-3465

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3438.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3438


commit fa68749267b40d156b3811368e2f49c5c8813a43
Author: Vahid Hashemian 
Date:   2017-06-26T20:24:46Z

KAFKA-3465: Clarify warning message of ConsumerOffsetChecker

Add that the tool works with the old consumer only.




> kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode
> --
>
> Key: KAFKA-3465
> URL: https://issues.apache.org/jira/browse/KAFKA-3465
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: BrianLing
>
> 1. When we enable MirrorMaker to migrate Kafka events from one cluster to 
> another in "new.consumer" mode:
> java -Xmx2G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 
> -XX:InitiatingHeapOccupancyPercent=35 -XX:+DisableExplicitGC 
> -Djava.awt.headless=true -Dcom.sun.management.jmxremote 
> -Dcom.sun.management.jmxremote.authenticate=false 
> -Dcom.sun.management.jmxremote.ssl=false 
> -Dkafka.logs.dir=/kafka/kafka-app-logs 
> -Dlog4j.configuration=file:/kafka/kafka_2.10-0.9.0.0/bin/../config/tools-log4j.properties
>  -cp :/kafka/kafka_2.10-0.9.0.0/bin/../libs/* 
> -Dkafka.logs.filename=lvs-slca-mm.log kafka.tools.MirrorMaker lvs-slca-mm.log 
> --consumer.config ../config/consumer.properties --new.consumer --num.streams 
> 4 --producer.config ../config/producer-slca.properties --whitelist risk.*
> 2. When we use the ConsumerOffsetChecker tool, we notice the lag doesn't 
> change and the owner is none.
> bin/kafka-run-class.sh  kafka.tools.ConsumerOffsetChecker --broker-info 
> --group lvs.slca.mirrormaker --zookeeper lvsdmetlvm01.lvs.paypal.com:2181 
> --topic 
> Group   Topic  Pid Offset  logSize
>  Lag Owner
> lvs.slca.mirrormaker   0   418578332   418678347   100015 
>  none
> lvs.slca.mirrormaker  1   418598026   418698338   100312  
> none
> [Root Cause]
> I think it's due to the new 0.9.0 feature that switches from the ZooKeeper 
> dependency to Kafka internals for storing offset & consumer owner information. 
>   Does it mean we cannot use the command below to check the new consumer's 
> lag, given the current lag formula lag = logSize - offset? 
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala#L80
>   
> https://github.com/apache/kafka/blob/0.9.0/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala#L174-L182
>  => offSet Fetch from zookeeper instead of from Kafka
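
For comparison, a rough sketch (placeholder broker and topic, with the group name from above) of computing the same lag = logSize - offset figure against Kafka-stored offsets using the new consumer API; this assumes a 0.10.1+ client, where endOffsets() is available:

{noformat}
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class NewConsumerLagCheck {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");      // placeholder broker
        props.put("group.id", "lvs.slca.mirrormaker");          // group to inspect
        props.put("enable.auto.commit", "false");
        props.put("key.deserializer", ByteArrayDeserializer.class.getName());
        props.put("value.deserializer", ByteArrayDeserializer.class.getName());

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("my-topic", 0); // placeholder topic
            OffsetAndMetadata committed = consumer.committed(tp);   // offset stored in Kafka
            Map<TopicPartition, Long> ends =
                consumer.endOffsets(Collections.singleton(tp));      // the "logSize" side
            if (committed != null) {
                long lag = ends.get(tp) - committed.offset();        // lag = logSize - offset
                System.out.printf("%s lag=%d%n", tp, lag);
            } else {
                System.out.printf("%s has no committed offset in Kafka%n", tp);
            }
        }
    }
}
{noformat}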



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-5520) Extend Consumer Group Reset Offset tool for Stream Applications

2017-06-26 Thread Jorge Quilcate (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jorge Quilcate updated KAFKA-5520:
--
Description: KIP documentation: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-171+-+Extend+Consumer+Group+Reset+Offset+for+Stream+Application
  (was: KIP documentation: TODO)

> Extend Consumer Group Reset Offset tool for Stream Applications
> ---
>
> Key: KAFKA-5520
> URL: https://issues.apache.org/jira/browse/KAFKA-5520
> Project: Kafka
>  Issue Type: Improvement
>  Components: core, tools
>Reporter: Jorge Quilcate
>  Labels: kip
> Fix For: 0.11.1.0
>
>
> KIP documentation: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-171+-+Extend+Consumer+Group+Reset+Offset+for+Stream+Application



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-3465) kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode

2017-06-26 Thread Vahid Hashemian (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063618#comment-16063618
 ] 

Vahid Hashemian commented on KAFKA-3465:


[~cmccabe] I have opened a PR to update the relevant documentation 
[here|https://github.com/apache/kafka/pull/3405]. Would that address your 
concern? Thanks. 

> kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode
> --
>
> Key: KAFKA-3465
> URL: https://issues.apache.org/jira/browse/KAFKA-3465
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: BrianLing
>
> 1. When we enable MirrorMaker to migrate Kafka events from one cluster to 
> another in "new.consumer" mode:
> java -Xmx2G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 
> -XX:InitiatingHeapOccupancyPercent=35 -XX:+DisableExplicitGC 
> -Djava.awt.headless=true -Dcom.sun.management.jmxremote 
> -Dcom.sun.management.jmxremote.authenticate=false 
> -Dcom.sun.management.jmxremote.ssl=false 
> -Dkafka.logs.dir=/kafka/kafka-app-logs 
> -Dlog4j.configuration=file:/kafka/kafka_2.10-0.9.0.0/bin/../config/tools-log4j.properties
>  -cp :/kafka/kafka_2.10-0.9.0.0/bin/../libs/* 
> -Dkafka.logs.filename=lvs-slca-mm.log kafka.tools.MirrorMaker lvs-slca-mm.log 
> --consumer.config ../config/consumer.properties --new.consumer --num.streams 
> 4 --producer.config ../config/producer-slca.properties --whitelist risk.*
> 2. When we use the ConsumerOffsetChecker tool, we notice the lag doesn't 
> change and the owner is none.
> bin/kafka-run-class.sh  kafka.tools.ConsumerOffsetChecker --broker-info 
> --group lvs.slca.mirrormaker --zookeeper lvsdmetlvm01.lvs.paypal.com:2181 
> --topic 
> Group   Topic  Pid Offset  logSize
>  Lag Owner
> lvs.slca.mirrormaker   0   418578332   418678347   100015 
>  none
> lvs.slca.mirrormaker  1   418598026   418698338   100312  
> none
> [Root Cause]
> I think it's due to the new 0.9.0 feature that switches from the ZooKeeper 
> dependency to Kafka internals for storing offset & consumer owner information. 
>   Does it mean we cannot use the command below to check the new consumer's 
> lag, given the current lag formula lag = logSize - offset? 
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala#L80
>   
> https://github.com/apache/kafka/blob/0.9.0/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala#L174-L182
>  => offSet Fetch from zookeeper instead of from Kafka



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5519) Support for multiple certificates in a single keystore

2017-06-26 Thread Alla Tumarkin (JIRA)
Alla Tumarkin created KAFKA-5519:


 Summary: Support for multiple certificates in a single keystore
 Key: KAFKA-5519
 URL: https://issues.apache.org/jira/browse/KAFKA-5519
 Project: Kafka
  Issue Type: New Feature
  Components: security
Affects Versions: 0.10.2.1
Reporter: Alla Tumarkin


Background
Currently, we need to have a keystore exclusive to the component with exactly 
one key in it. Looking at the JSSE Reference guide, it seems like we would need 
to introduce our own KeyManager into the SSLContext which selects a 
configurable key alias name.
https://docs.oracle.com/javase/7/docs/api/javax/net/ssl/X509KeyManager.html 
has methods for dealing with aliases.
The goal here is to use a specific certificate (with proper ACLs set for this 
client), and not just the first one that matches.
It looks like this requires a code change to the SSLChannelBuilder.
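
A rough sketch of what such a key manager could look like; the class name and the idea of a configurable alias (e.g. a hypothetical ssl.keystore.alias setting) are assumptions, and for SSLEngine-based connections the chooseEngine* variants of X509ExtendedKeyManager would need the same treatment. Wiring it into the channel builder is the part that would need the Kafka code change.

{noformat}
import java.net.Socket;
import java.security.Principal;
import java.security.PrivateKey;
import java.security.cert.X509Certificate;
import javax.net.ssl.X509KeyManager;

/** Delegates to the JSSE-provided key manager but always selects one configured alias. */
public class FixedAliasKeyManager implements X509KeyManager {
    private final X509KeyManager delegate;
    private final String alias; // e.g. from a hypothetical ssl.keystore.alias config

    public FixedAliasKeyManager(X509KeyManager delegate, String alias) {
        this.delegate = delegate;
        this.alias = alias;
    }

    @Override
    public String chooseClientAlias(String[] keyType, Principal[] issuers, Socket socket) {
        return alias; // ignore "first match" behaviour and force the configured alias
    }

    @Override
    public String chooseServerAlias(String keyType, Principal[] issuers, Socket socket) {
        return alias;
    }

    @Override
    public String[] getClientAliases(String keyType, Principal[] issuers) {
        return delegate.getClientAliases(keyType, issuers);
    }

    @Override
    public String[] getServerAliases(String keyType, Principal[] issuers) {
        return delegate.getServerAliases(keyType, issuers);
    }

    @Override
    public X509Certificate[] getCertificateChain(String alias) {
        return delegate.getCertificateChain(alias);
    }

    @Override
    public PrivateKey getPrivateKey(String alias) {
        return delegate.getPrivateKey(alias);
    }
}
{noformat}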



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-3465) kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode

2017-06-26 Thread Colin P. McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063569#comment-16063569
 ] 

Colin P. McCabe commented on KAFKA-3465:


Should we rename the tool or add documentation so that, when people find this 
class, it is clear that it only works with the old consumer?

> kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode
> --
>
> Key: KAFKA-3465
> URL: https://issues.apache.org/jira/browse/KAFKA-3465
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: BrianLing
>
> 1. When we enable MirrorMaker to migrate Kafka events from one cluster to 
> another in "new.consumer" mode:
> java -Xmx2G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 
> -XX:InitiatingHeapOccupancyPercent=35 -XX:+DisableExplicitGC 
> -Djava.awt.headless=true -Dcom.sun.management.jmxremote 
> -Dcom.sun.management.jmxremote.authenticate=false 
> -Dcom.sun.management.jmxremote.ssl=false 
> -Dkafka.logs.dir=/kafka/kafka-app-logs 
> -Dlog4j.configuration=file:/kafka/kafka_2.10-0.9.0.0/bin/../config/tools-log4j.properties
>  -cp :/kafka/kafka_2.10-0.9.0.0/bin/../libs/* 
> -Dkafka.logs.filename=lvs-slca-mm.log kafka.tools.MirrorMaker lvs-slca-mm.log 
> --consumer.config ../config/consumer.properties --new.consumer --num.streams 
> 4 --producer.config ../config/producer-slca.properties --whitelist risk.*
> 2. When we use the ConsumerOffsetChecker tool, we notice the lag doesn't 
> change and the owner is none.
> bin/kafka-run-class.sh  kafka.tools.ConsumerOffsetChecker --broker-info 
> --group lvs.slca.mirrormaker --zookeeper lvsdmetlvm01.lvs.paypal.com:2181 
> --topic 
> Group   Topic  Pid Offset  logSize
>  Lag Owner
> lvs.slca.mirrormaker   0   418578332   418678347   100015 
>  none
> lvs.slca.mirrormaker  1   418598026   418698338   100312  
> none
> [Root Cause]
> I think it's due to the new 0.9.0 feature that switches from the ZooKeeper 
> dependency to Kafka internals for storing offset & consumer owner information. 
>   Does it mean we cannot use the command below to check the new consumer's 
> lag, given the current lag formula lag = logSize - offset? 
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala#L80
>   
> https://github.com/apache/kafka/blob/0.9.0/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala#L174-L182
>  => offSet Fetch from zookeeper instead of from Kafka



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5413) Log cleaner fails due to large offset in segment file

2017-06-26 Thread Nicholas Ngorok (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063514#comment-16063514
 ] 

Nicholas Ngorok commented on KAFKA-5413:


Thanks [~Kelvinrutt] for getting this solved very quickly! [~junrao], is there 
a timeline/plan for the 0.10.2.2 release?

> Log cleaner fails due to large offset in segment file
> -
>
> Key: KAFKA-5413
> URL: https://issues.apache.org/jira/browse/KAFKA-5413
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.2.0, 0.10.2.1
> Environment: Ubuntu 14.04 LTS, Oracle Java 8u92, kafka_2.11-0.10.2.0
>Reporter: Nicholas Ngorok
>Assignee: Kelvin Rutt
>Priority: Critical
>  Labels: reliability
> Fix For: 0.11.0.0, 0.10.2.2
>
> Attachments: .index.cleaned, 
> .log, .log.cleaned, 
> .timeindex.cleaned, 002147422683.log, 
> kafka-5413.patch
>
>
> The log cleaner thread in our brokers is failing with the trace below
> {noformat}
> [2017-06-08 15:49:54,822] INFO {kafka-log-cleaner-thread-0} Cleaner 0: 
> Cleaning segment 0 in log __consumer_offsets-12 (largest timestamp Thu Jun 08 
> 15:48:59 PDT 2017) into 0, retaining deletes. (kafka.log.LogCleaner)
> [2017-06-08 15:49:54,822] INFO {kafka-log-cleaner-thread-0} Cleaner 0: 
> Cleaning segment 2147343575 in log __consumer_offsets-12 (largest timestamp 
> Thu Jun 08 15:49:06 PDT 2017) into 0, retaining deletes. 
> (kafka.log.LogCleaner)
> [2017-06-08 15:49:54,834] ERROR {kafka-log-cleaner-thread-0} 
> [kafka-log-cleaner-thread-0], Error due to  (kafka.log.LogCleaner)
> java.lang.IllegalArgumentException: requirement failed: largest offset in 
> message set can not be safely converted to relative offset.
> at scala.Predef$.require(Predef.scala:224)
> at kafka.log.LogSegment.append(LogSegment.scala:109)
> at kafka.log.Cleaner.cleanInto(LogCleaner.scala:478)
> at 
> kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:405)
> at 
> kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:401)
> at scala.collection.immutable.List.foreach(List.scala:381)
> at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:401)
> at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:363)
> at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:362)
> at scala.collection.immutable.List.foreach(List.scala:381)
> at kafka.log.Cleaner.clean(LogCleaner.scala:362)
> at 
> kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:241)
> at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:220)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> [2017-06-08 15:49:54,835] INFO {kafka-log-cleaner-thread-0} 
> [kafka-log-cleaner-thread-0], Stopped  (kafka.log.LogCleaner)
> {noformat}
> This seems to point at the specific line [here| 
> https://github.com/apache/kafka/blob/0.11.0/core/src/main/scala/kafka/log/LogSegment.scala#L92]
>  in the kafka src where the difference is actually larger than MAXINT as both 
> baseOffset and offset are of type long. It was introduced in this [pr| 
> https://github.com/apache/kafka/pull/2210/files/56d1f8196b77a47b176b7bbd1e4220a3be827631]
> These were the outputs of dumping the first two log segments
> {noformat}
> :~$ /usr/bin/kafka-run-class kafka.tools.DumpLogSegments --deep-iteration 
> --files /kafka-logs/__consumer_offsets-12/000
> 0.log
> Dumping /kafka-logs/__consumer_offsets-12/.log
> Starting offset: 0
> offset: 1810054758 position: 0 NoTimestampType: -1 isvalid: true payloadsize: 
> -1 magic: 0 compresscodec: NONE crc: 3127861909 keysize: 34
> :~$ /usr/bin/kafka-run-class kafka.tools.DumpLogSegments --deep-iteration 
> --files /kafka-logs/__consumer_offsets-12/000
> 0002147343575.log
> Dumping /kafka-logs/__consumer_offsets-12/002147343575.log
> Starting offset: 2147343575
> offset: 2147539884 position: 0 NoTimestampType: -1 isvalid: true 
> payloadsize: -1 magic: 0 compresscodec: NONE crc: 2282192097 keysize: 34
> {noformat}
> My guess is that since 2147539884 is larger than MAXINT, we are hitting this 
> exception. Was there a specific reason this check was added in 0.10.2?
> E.g. if the first offset is a key = "key 0" and then we have MAXINT + 1 of 
> "key 1" following, wouldn't we run into this situation whenever the log 
> cleaner runs?
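
A small self-contained illustration of the arithmetic involved, using the base offset and the offset observed in the dumps above: both values are longs, but their difference has to fit in an int for the relative-offset encoding, which is roughly what the failing require checks.

{noformat}
public class RelativeOffsetCheck {
    public static void main(String[] args) {
        long baseOffset = 0L;               // base offset of the segment being cleaned into
        long largestOffset = 2147539884L;   // offset observed in the second dump above

        // The relative offset is stored as an int, so (offset - baseOffset) must not
        // exceed Integer.MAX_VALUE (2147483647) even though both values are longs.
        long delta = largestOffset - baseOffset;
        boolean convertible = delta <= Integer.MAX_VALUE;
        System.out.printf("delta=%d convertible=%b%n", delta, convertible);
        // Prints delta=2147539884 convertible=false -- the condition the cleaner hits.
    }
}
{noformat}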



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5007) Kafka Replica Fetcher Thread- Resource Leak

2017-06-26 Thread Joseph Aliase (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063456#comment-16063456
 ] 

Joseph Aliase commented on KAFKA-5007:
--

I believe that's the error we are seeing in the log. Let me reproduce the issue 
today. Will confirm

thanks [~huxi_2b]

> Kafka Replica Fetcher Thread- Resource Leak
> ---
>
> Key: KAFKA-5007
> URL: https://issues.apache.org/jira/browse/KAFKA-5007
> Project: Kafka
>  Issue Type: Bug
>  Components: core, network
>Affects Versions: 0.10.0.0, 0.10.1.1, 0.10.2.0
> Environment: Centos 7
> Jave 8
>Reporter: Joseph Aliase
>Priority: Critical
>  Labels: reliability
> Attachments: jstack-kafka.out, jstack-zoo.out, lsofkafka.txt, 
> lsofzookeeper.txt
>
>
> Kafka runs out of open file descriptors when the system network interface is 
> down.
> Issue description:
> We have a Kafka cluster of 5 nodes running on version 0.10.1.1. The open file 
> descriptor limit for the account running Kafka is set to 10.
> During an upgrade, the network interface went down. The outage continued for 
> 12 hours; eventually all the brokers crashed with a java.io.IOException: Too 
> many open files error.
> We repeated the test in a lower environment and observed that the open socket 
> count keeps increasing while the NIC is down.
> We have around 13 topics with a max partition count of 120, and the number of 
> replica fetcher threads is set to 8.
> Using an internal monitoring tool we observed that the open socket descriptor 
> count for the broker pid continued to increase while the NIC was down, 
> leading to the open file descriptor error. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-3554) Generate actual data with specific compression ratio and add multi-thread support in the ProducerPerformance tool.

2017-06-26 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063452#comment-16063452
 ] 

Chen He commented on KAFKA-3554:


Thank you for the quick reply [~becket_qin]. This work is really valuable. It 
provides a tool that can explore a Kafka system's capacity. For example, we 
can get the lowest latency by using only 1 thread, and at the same time, by 
increasing the thread count, we can find the maximum throughput for a Kafka 
cluster. 

Only one question: I applied this patch to the latest Kafka and compared the 
results with the old ProducerPerformance.java file. I found that if we set 
acks=all with snappy compression and 100M records (100 B each), it does not 
perform as well as the old ProducerPerformance.java file. 

> Generate actual data with specific compression ratio and add multi-thread 
> support in the ProducerPerformance tool.
> --
>
> Key: KAFKA-3554
> URL: https://issues.apache.org/jira/browse/KAFKA-3554
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.9.0.1
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.11.1.0
>
>
> Currently the ProducerPerformance tool always generates the payload with the 
> same bytes. This does not work well for testing compressed data because the 
> payload is extremely compressible no matter how big it is.
> We can make some changes to make it more useful for compressed messages. 
> Currently I am generating a payload containing integers from a given range. 
> By adjusting the range of the integers, we can get different compression 
> ratios. 
> API-wise, we can either let the user specify the integer range or the 
> expected compression ratio (we would do some probing to find the 
> corresponding range for the user).
> Besides that, in many cases it is useful to have multiple producer threads 
> when the producer threads themselves are the bottleneck. Admittedly, people 
> can run multiple ProducerPerformance instances to achieve a similar result, 
> but it is still different from the real case where people actually use the 
> producer.
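
A toy sketch of the payload-generation idea described above (the record size and ranges are arbitrary): drawing each appended integer from a bounded range makes the payload partially compressible, and narrowing the range increases the compression ratio.

{noformat}
import java.nio.charset.StandardCharsets;
import java.util.Random;

public class CompressiblePayloadGenerator {
    /**
     * Builds a payload of roughly the requested size whose content is integers drawn
     * from [0, range). A smaller range yields a more repetitive, more compressible payload.
     */
    public static byte[] generate(Random random, int sizeBytes, int range) {
        StringBuilder sb = new StringBuilder(sizeBytes);
        while (sb.length() < sizeBytes) {
            sb.append(random.nextInt(range)).append(' ');
        }
        return sb.substring(0, sizeBytes).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Random random = new Random(42);
        byte[] highlyCompressible = generate(random, 100, 10);        // narrow range
        byte[] lessCompressible = generate(random, 100, 1_000_000);   // wide range
        System.out.println(new String(highlyCompressible, StandardCharsets.UTF_8));
        System.out.println(new String(lessCompressible, StandardCharsets.UTF_8));
    }
}
{noformat}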



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5473) handle ZK session expiration properly when a new session can't be established

2017-06-26 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063437#comment-16063437
 ] 

Jun Rao commented on KAFKA-5473:


[~prasincs], there are a couple of cases to consider. (1) This is the case where 
the broker knows for sure that it's not registered with ZK. In this case, it 
seems failing the broker is better since from the ZK server's perspective, the 
broker is down and failing the broker will make the behavior of the broker 
consistent with what's in ZK server. This is the issue that this particular 
jira is trying to solve. I think we can just wait up to zk.connection.time.ms 
and do a clean shutdown. (2) There is another case that the broker is 
partitioned off from ZK server. In this case, ZK server may have expired the 
session of the broker. However, until the network connection is back, the 
broker doesn't know that its session has expired. In that mode, currently the 
broker doesn't shut down and just waits until the network connection is back.

> handle ZK session expiration properly when a new session can't be established
> -
>
> Key: KAFKA-5473
> URL: https://issues.apache.org/jira/browse/KAFKA-5473
> Project: Kafka
>  Issue Type: Sub-task
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Prasanna Gautam
>
> In https://issues.apache.org/jira/browse/KAFKA-2405, we changed the logic for 
> handling ZK session expiration a bit. If a new ZK session can't be 
> established after session expiration, we just log an error and continue. 
> However, this can leave the broker in a bad state since it's up, but not 
> registered from the controller's perspective. Replicas on this broker may 
> never be in sync.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5518) General Kafka connector performance workload

2017-06-26 Thread Chen He (JIRA)
Chen He created KAFKA-5518:
--

 Summary: General Kafka connector performance workload
 Key: KAFKA-5518
 URL: https://issues.apache.org/jira/browse/KAFKA-5518
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 0.10.2.1
Reporter: Chen He


Sorry, this is my first time creating a Kafka JIRA. I am just curious whether 
there is a general-purpose performance workload for Kafka connectors (HDFS, 
S3, etc.). Then we can set up a standard and evaluate the performance of 
further connectors such as Swift, etc.

Please feel free to comment or mark this as a duplicate if there is already a 
JIRA tracking it.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-5517) Support linking to particular configuration parameters

2017-06-26 Thread Tom Bentley (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom Bentley updated KAFKA-5517:
---
Labels: patch-available  (was: )

> Support linking to particular configuration parameters
> --
>
> Key: KAFKA-5517
> URL: https://issues.apache.org/jira/browse/KAFKA-5517
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Tom Bentley
>Assignee: Tom Bentley
>Priority: Minor
>  Labels: patch-available
>
> Currently the configuration parameters are documented in long tables, and it's 
> only possible to link to the heading before a particular table. When 
> discussing configuration parameters on forums it would be helpful to be able 
> to link to the particular parameter under discussion.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5517) Support linking to particular configuration parameters

2017-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063294#comment-16063294
 ] 

ASF GitHub Bot commented on KAFKA-5517:
---

GitHub user tombentley opened a pull request:

https://github.com/apache/kafka/pull/3436

KAFKA-5517: Add id to config HTML tables to allow linking



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tombentley/kafka KAFKA-5517

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3436.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3436






> Support linking to particular configuration parameters
> --
>
> Key: KAFKA-5517
> URL: https://issues.apache.org/jira/browse/KAFKA-5517
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Tom Bentley
>Assignee: Tom Bentley
>Priority: Minor
>
> Currently the configuration parameters are documented in long tables, and it's 
> only possible to link to the heading before a particular table. When 
> discussing configuration parameters on forums it would be helpful to be able 
> to link to the particular parameter under discussion.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5517) Support linking to particular configuration parameters

2017-06-26 Thread Tom Bentley (JIRA)
Tom Bentley created KAFKA-5517:
--

 Summary: Support linking to particular configuration parameters
 Key: KAFKA-5517
 URL: https://issues.apache.org/jira/browse/KAFKA-5517
 Project: Kafka
  Issue Type: Improvement
  Components: documentation
Reporter: Tom Bentley
Assignee: Tom Bentley
Priority: Minor


Currently the configuration parameters are documented in long tables, and it's 
only possible to link to the heading before a particular table. When discussing 
configuration parameters on forums it would be helpful to be able to link to 
the particular parameter under discussion.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5413) Log cleaner fails due to large offset in segment file

2017-06-26 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063228#comment-16063228
 ] 

Jun Rao commented on KAFKA-5413:


Merged https://github.com/apache/kafka/pull/3397 to 0.10.2.

> Log cleaner fails due to large offset in segment file
> -
>
> Key: KAFKA-5413
> URL: https://issues.apache.org/jira/browse/KAFKA-5413
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.2.0, 0.10.2.1
> Environment: Ubuntu 14.04 LTS, Oracle Java 8u92, kafka_2.11-0.10.2.0
>Reporter: Nicholas Ngorok
>Assignee: Kelvin Rutt
>Priority: Critical
>  Labels: reliability
> Fix For: 0.11.0.0, 0.10.2.2
>
> Attachments: .index.cleaned, 
> .log, .log.cleaned, 
> .timeindex.cleaned, 002147422683.log, 
> kafka-5413.patch
>
>
> The log cleaner thread in our brokers is failing with the trace below
> {noformat}
> [2017-06-08 15:49:54,822] INFO {kafka-log-cleaner-thread-0} Cleaner 0: 
> Cleaning segment 0 in log __consumer_offsets-12 (largest timestamp Thu Jun 08 
> 15:48:59 PDT 2017) into 0, retaining deletes. (kafka.log.LogCleaner)
> [2017-06-08 15:49:54,822] INFO {kafka-log-cleaner-thread-0} Cleaner 0: 
> Cleaning segment 2147343575 in log __consumer_offsets-12 (largest timestamp 
> Thu Jun 08 15:49:06 PDT 2017) into 0, retaining deletes. 
> (kafka.log.LogCleaner)
> [2017-06-08 15:49:54,834] ERROR {kafka-log-cleaner-thread-0} 
> [kafka-log-cleaner-thread-0], Error due to  (kafka.log.LogCleaner)
> java.lang.IllegalArgumentException: requirement failed: largest offset in 
> message set can not be safely converted to relative offset.
> at scala.Predef$.require(Predef.scala:224)
> at kafka.log.LogSegment.append(LogSegment.scala:109)
> at kafka.log.Cleaner.cleanInto(LogCleaner.scala:478)
> at 
> kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:405)
> at 
> kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:401)
> at scala.collection.immutable.List.foreach(List.scala:381)
> at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:401)
> at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:363)
> at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:362)
> at scala.collection.immutable.List.foreach(List.scala:381)
> at kafka.log.Cleaner.clean(LogCleaner.scala:362)
> at 
> kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:241)
> at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:220)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> [2017-06-08 15:49:54,835] INFO {kafka-log-cleaner-thread-0} 
> [kafka-log-cleaner-thread-0], Stopped  (kafka.log.LogCleaner)
> {noformat}
> This seems to point at the specific line [here| 
> https://github.com/apache/kafka/blob/0.11.0/core/src/main/scala/kafka/log/LogSegment.scala#L92]
>  in the kafka src where the difference is actually larger than MAXINT as both 
> baseOffset and offset are of type long. It was introduced in this [pr| 
> https://github.com/apache/kafka/pull/2210/files/56d1f8196b77a47b176b7bbd1e4220a3be827631]
> These were the outputs of dumping the first two log segments
> {noformat}
> :~$ /usr/bin/kafka-run-class kafka.tools.DumpLogSegments --deep-iteration 
> --files /kafka-logs/__consumer_offsets-12/000
> 0.log
> Dumping /kafka-logs/__consumer_offsets-12/.log
> Starting offset: 0
> offset: 1810054758 position: 0 NoTimestampType: -1 isvalid: true payloadsize: 
> -1 magic: 0 compresscodec: NONE crc: 3127861909 keysize: 34
> :~$ /usr/bin/kafka-run-class kafka.tools.DumpLogSegments --deep-iteration 
> --files /kafka-logs/__consumer_offsets-12/000
> 0002147343575.log
> Dumping /kafka-logs/__consumer_offsets-12/002147343575.log
> Starting offset: 2147343575
> offset: 2147539884 position: 0 NoTimestampType: -1 isvalid: true 
> payloadsize: -1 magic: 0 compresscodec: NONE crc: 2282192097 keysize: 34
> {noformat}
> My guess is that since 2147539884 is larger than MAXINT, we are hitting this 
> exception. Was there a specific reason this check was added in 0.10.2?
> E.g. if the first offset is a key = "key 0" and then we have MAXINT + 1 of 
> "key 1" following, wouldn't we run into this situation whenever the log 
> cleaner runs?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-4388) Connect key and value converters are listed without default values

2017-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063107#comment-16063107
 ] 

ASF GitHub Bot commented on KAFKA-4388:
---

GitHub user evis opened a pull request:

https://github.com/apache/kafka/pull/3435

KAFKA-4388 Recommended values for converters from plugins

Questions to reviewers:
1. Should we cache `converterRecommenders.validValues()`, 
`SinkConnectorConfig.configDef()` and `SourceConnectorConfig.configDef()` 
results?
2. What is the appropriate place for testing the new 
`ConnectorConfig.configDef(plugins)` functionality?

cc @ewencp 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/evis/kafka converters_values

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3435.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3435


commit 069a1ba832f844c20224598c622fa19576b0ba61
Author: Evgeny Veretennikov 
Date:   2017-06-26T13:44:40Z

KAFKA-4388 Recommended values for converters from plugins

ConnectorConfig.configDef() takes Plugins parameter now. List of
recommended values for converters is taken from plugins.converters()




> Connect key and value converters are listed without default values
> --
>
> Key: KAFKA-4388
> URL: https://issues.apache.org/jira/browse/KAFKA-4388
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.1.0
>Reporter: Ewen Cheslack-Postava
>Assignee: Evgeny Veretennikov
>Priority: Minor
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> KIP-75 added per-connector converters. This exposes the settings on a 
> per-connector basis via the validation API. However, the way this is 
> specified for each connector is via a config value with no default value. 
> This means the validation API implies there is no setting unless you provide 
> one.
> It would be much better to include the default value extracted from the 
> WorkerConfig instead so it's clear you shouldn't need to override the default.
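
A simplified sketch of the difference using ConfigDef (the key, doc string, and default value below are illustrative, and the real Connect definition uses a class-typed config): defining the converter key without a default versus with a default pulled from the worker-level setting.

{noformat}
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.common.config.ConfigDef.Importance;
import org.apache.kafka.common.config.ConfigDef.Type;

public class ConverterConfigDefSketch {
    public static void main(String[] args) {
        // As described above: no default value, so the validation API implies that
        // there is no setting unless the user provides one.
        ConfigDef withoutDefault = new ConfigDef()
            .define("key.converter", Type.STRING, Importance.LOW,
                    "Converter class for keys (per-connector override).");

        // Preferred: surface the worker-level value as the default so the effective
        // setting is visible. The value here is purely illustrative.
        String workerDefault = "org.apache.kafka.connect.json.JsonConverter";
        ConfigDef withDefault = new ConfigDef()
            .define("key.converter", Type.STRING, workerDefault, Importance.LOW,
                    "Converter class for keys (per-connector override).");

        System.out.println(withoutDefault.configKeys().get("key.converter").defaultValue);
        System.out.println(withDefault.configKeys().get("key.converter").defaultValue);
    }
}
{noformat}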



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5516) Formatting verifiable producer/consumer output in a similar fashion

2017-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063049#comment-16063049
 ] 

ASF GitHub Bot commented on KAFKA-5516:
---

GitHub user ppatierno opened a pull request:

https://github.com/apache/kafka/pull/3434

KAFKA-5516: Formatting verifiable producer/consumer output in a similar 
fashion



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ppatierno/kafka verifiable-consumer-producer

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3434.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3434


commit 6e6d728dce83ea689061a196e1b6811b447d4db7
Author: ppatierno 
Date:   2017-06-26T12:07:08Z

Modified JSON order attributes in a more readable fashion

commit 9235aadfccf446415ffbfd5d90c8d4faeddecc08
Author: ppatierno 
Date:   2017-06-26T12:56:07Z

Fixed documentation about old request.required.acks producer parameter
Modified JSON order attributes in a more readable fashion
Refactoring on verifiable producer to be like the verifiable consumer




> Formatting verifiable producer/consumer output in a similar fashion
> ---
>
> Key: KAFKA-5516
> URL: https://issues.apache.org/jira/browse/KAFKA-5516
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Reporter: Paolo Patierno
>Assignee: Paolo Patierno
>Priority: Trivial
>
> Hi,
> following the proposal to have the verifiable producer/consumer provide very 
> similar output, where the "timestamp" is always the first column, followed by 
> the event "name" and then all the data specific to that event.
> It includes a refactoring of the verifiable producer so that its output is 
> produced in the same way as the verifiable consumer's.
> Thanks,
> Paolo



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5516) Formatting verifiable producer/consumer output in a similar fashion

2017-06-26 Thread Paolo Patierno (JIRA)
Paolo Patierno created KAFKA-5516:
-

 Summary: Formatting verifiable producer/consumer output in a 
similar fashion
 Key: KAFKA-5516
 URL: https://issues.apache.org/jira/browse/KAFKA-5516
 Project: Kafka
  Issue Type: Improvement
  Components: tools
Reporter: Paolo Patierno
Assignee: Paolo Patierno
Priority: Trivial


Hi,
following the proposal to have the verifiable producer/consumer provide very 
similar output, where the "timestamp" is always the first column, followed by 
the event "name" and then all the data specific to that event.
It includes a refactoring of the verifiable producer so that its output is 
produced in the same way as the verifiable consumer's.

Thanks,
Paolo



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-5515) Consider removing date formatting from Segments class

2017-06-26 Thread Bill Bejeck (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Bejeck updated KAFKA-5515:
---
Description: Currently the {{Segments}} class uses a date when calculating 
the segment id and uses {{SimpleDateFormat}} for formatting the segment id.  
However this is a high volume code path and creating a new {{SimpleDateFormat}} 
and formatting each segment id is expensive.  We should look into removing the 
date from the segment id or at a minimum use a faster alternative to 
{{SimpleDateFormat}}.  We should also consider keeping a lookup of existing 
segments to avoid as many string operations as possible.  (was: Currently the 
{{Segments}} class uses a date when calculating the segment id and uses 
{{SimpleDateFormat}} for formatting the segment id.  However this is a high 
volume code path and creating a new {{SimpleDateFormat}} for each segment id is 
expensive.  We should look into removing the date from the segment id or at a 
minimum use a faster alternative to {{SimpleDateFormat}}.  We should also 
consider keeping a lookup of existing segments to avoid as many string 
operations as possible.)

> Consider removing date formatting from Segments class
> -
>
> Key: KAFKA-5515
> URL: https://issues.apache.org/jira/browse/KAFKA-5515
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Bill Bejeck
>  Labels: performance
>
> Currently the {{Segments}} class uses a date when calculating the segment id 
> and uses {{SimpleDateFormat}} for formatting the segment id.  However this is 
> a high volume code path and creating a new {{SimpleDateFormat}} and 
> formatting each segment id is expensive.  We should look into removing the 
> date from the segment id or at a minimum use a faster alternative to 
> {{SimpleDateFormat}}.  We should also consider keeping a lookup of existing 
> segments to avoid as many string operations as possible.
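
To make the cost concrete, a small sketch contrasting the two approaches; the naming scheme is simplified and not the actual Segments code, but it shows per-call SimpleDateFormat allocation versus deriving the id from the raw timestamp with plain long arithmetic.

{noformat}
import java.text.SimpleDateFormat;
import java.util.Date;

public class SegmentIdFormatting {
    // Costly pattern: a new SimpleDateFormat is allocated and used on every call.
    static String segmentIdWithDate(String storeName, long segmentStartMs) {
        SimpleDateFormat formatter = new SimpleDateFormat("yyyyMMddHHmm");
        return storeName + "-" + formatter.format(new Date(segmentStartMs));
    }

    // Cheaper alternative: no date formatting at all, just long arithmetic.
    static String segmentIdNumeric(String storeName, long segmentStartMs, long segmentIntervalMs) {
        return storeName + "." + (segmentStartMs / segmentIntervalMs);
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        System.out.println(segmentIdWithDate("window-store", now));
        System.out.println(segmentIdNumeric("window-store", now, 60_000L));
    }
}
{noformat}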



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-5515) Consider removing date formatting from Segments class

2017-06-26 Thread Bill Bejeck (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Bejeck updated KAFKA-5515:
---
Description: Currently the {{Segments}} class uses a date when calculating 
the segment id and uses {{SimpleDateFormat}} for formatting the segment id.  
However this is a high volume code path and creating a new {{SimpleDateFormat}} 
for each segment id is expensive.  We should look into removing the date from 
the segment id or at a minimum use a faster alternative to 
{{SimpleDateFormat}}.  We should also consider keeping a lookup of existing 
segments to avoid as many string operations as possible.  (was: Currently the 
{{Segments}} class uses a date when calculating the segment id and uses 
{{SimpleDateFormat}} for formatting the segment id.  However this is a high 
volume code path and creating a new {{SimpleDateFormat}} for each segment id is 
expensive.  We should look into removing the date from the segment id or at a 
minimum use a faster alternative to {{SimpleDateFormat}} )

> Consider removing date formatting from Segments class
> -
>
> Key: KAFKA-5515
> URL: https://issues.apache.org/jira/browse/KAFKA-5515
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Bill Bejeck
>  Labels: performance
>
> Currently the {{Segments}} class uses a date when calculating the segment id 
> and uses {{SimpleDateFormat}} for formatting the segment id.  However this is 
> a high volume code path and creating a new {{SimpleDateFormat}} for each 
> segment id is expensive.  We should look into removing the date from the 
> segment id or at a minimum use a faster alternative to {{SimpleDateFormat}}.  
> We should also consider keeping a lookup of existing segments to avoid as 
> many string operations as possible.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5514) KafkaConsumer ignores default values in Properties object because of incorrect use of Properties object.

2017-06-26 Thread Geert Schuring (JIRA)
Geert Schuring created KAFKA-5514:
-

 Summary: KafkaConsumer ignores default values in Properties object 
because of incorrect use of Properties object.
 Key: KAFKA-5514
 URL: https://issues.apache.org/jira/browse/KAFKA-5514
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 0.10.2.1
Reporter: Geert Schuring


When setting default values in a Properties object, the KafkaConsumer ignores 
these values because the Properties object is being treated as a Map. The 
ConsumerConfig object uses the putAll method to copy properties from the 
incoming object to its local copy. (See 
https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java#L471)

This is incorrect because it only copies the explicit properties and ignores 
the default values also present in the properties object. (Also see: 
https://stackoverflow.com/questions/2004833/how-to-merge-two-java-util-properties-objects)
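
A small standalone reproduction of the described behaviour: putAll sees only the explicit entries of a Properties object, while stringPropertyNames()/getProperty also resolve the defaults.

{noformat}
import java.util.Properties;

public class PropertiesDefaultsDemo {
    public static void main(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty("auto.offset.reset", "earliest");      // default value

        Properties props = new Properties(defaults);
        props.setProperty("bootstrap.servers", "localhost:9092");   // explicit value

        // Copying with putAll, as described above: only explicit entries survive.
        Properties copiedViaPutAll = new Properties();
        copiedViaPutAll.putAll(props);
        System.out.println(copiedViaPutAll.getProperty("auto.offset.reset")); // null

        // A copy that also honours the defaults.
        Properties copiedWithDefaults = new Properties();
        for (String name : props.stringPropertyNames()) {
            copiedWithDefaults.setProperty(name, props.getProperty(name));
        }
        System.out.println(copiedWithDefaults.getProperty("auto.offset.reset")); // earliest
    }
}
{noformat}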



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5372) Unexpected state transition Dead to PendingShutdown

2017-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16062996#comment-16062996
 ] 

ASF GitHub Bot commented on KAFKA-5372:
---

GitHub user enothereska opened a pull request:

https://github.com/apache/kafka/pull/3432

KAFKA-5372: fixes to state transitions



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/enothereska/kafka KAFKA-5372-state-transitions

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3432.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3432


commit 5f57558ef293351d9f5db11edb62089367e39b76
Author: Eno Thereska 
Date:   2017-06-26T11:04:20Z

Checkpoint

commit ac372b196998052a024aac47af64dbd803a65733
Author: Eno Thereska 
Date:   2017-06-26T12:11:26Z

Some fixes




> Unexpected state transition Dead to PendingShutdown
> ---
>
> Key: KAFKA-5372
> URL: https://issues.apache.org/jira/browse/KAFKA-5372
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Reporter: Jason Gustafson
>Assignee: Eno Thereska
> Fix For: 0.11.1.0
>
>
> I often see this running integration tests:
> {code}
> [2017-06-02 15:36:03,411] WARN stream-thread 
> [appId-1-c382ef0a-adbd-422b-9717-9b2bc52b55eb-StreamThread-13] Unexpected 
> state transition from DEAD to PENDING_SHUTDOWN. 
> (org.apache.kafka.streams.processor.internals.StreamThread:976)
> {code}
> Maybe a race condition on shutdown or something?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5512) KafkaConsumer: High memory allocation rate when idle

2017-06-26 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16062859#comment-16062859
 ] 

Ismael Juma commented on KAFKA-5512:


Nice catch. Are you intending to submit a PR with your fixes?

> KafkaConsumer: High memory allocation rate when idle
> 
>
> Key: KAFKA-5512
> URL: https://issues.apache.org/jira/browse/KAFKA-5512
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.1.1
>Reporter: Stephane Roset
>  Labels: performance
> Fix For: 0.11.0.1
>
>
> Hi,
> We noticed in our application that the memory allocation rate increased 
> significantly when we have no Kafka messages to consume. We isolated the 
> issue by using a JVM that simply runs 128 Kafka consumers. These consumers 
> consume 128 partitions (so each consumer consumes one partition). The 
> partitions are empty and no message has been sent during the test. The 
> consumers were configured with default values (session.timeout.ms=30000, 
> fetch.max.wait.ms=500, receive.buffer.bytes=65536, 
> heartbeat.interval.ms=3000, max.poll.interval.ms=300000, 
> max.poll.records=500). The Kafka cluster was made of 3 brokers. Within this 
> context, the allocation rate was about 55 MiB/s. This high allocation rate 
> generates a lot of GC activity (to collect the young generation) and was an 
> issue for our project.
> We profiled the JVM with JProfiler. We noticed that there was a huge 
> quantity of ArrayList$Itr instances in memory. These iterators were mainly 
> instantiated by the methods handleCompletedReceives, handleCompletedSends, 
> handleConnections and handleDisconnections of the class NetworkClient. We also 
> noticed that we had a lot of calls to the pollOnce method of the class 
> KafkaConsumer.
> So we decided to run only one consumer and to profile the calls to the method 
> pollOnce. We noticed that regularly a huge number of calls is made to this 
> method, up to 268000 calls within 100ms. The pollOnce method calls the 
> NetworkClient.handle* methods. These methods iterate on collections (even if 
> they are empty), so that explains the huge number of iterators in memory.
> The large number of calls is related to the heartbeat mechanism. The pollOnce 
> method calculates the poll timeout; if a heartbeat needs to be done, the 
> timeout will be set to 0. The problem is that the heartbeat thread checks 
> every 100 ms (default value of retry.backoff.ms) if a heartbeat should be 
> sent, so the KafkaConsumer will call the poll method in a loop without 
> timeout until the heartbeat thread awakes. For example: the heartbeat thread 
> just started to wait and will awake in 99ms. So during 99ms, the 
> KafkaConsumer will call in a loop the pollOnce method and will use a timeout 
> of 0. That explains how we can have 268000 calls within 100ms. 
> The heartbeat thread calls the method AbstractCoordinator.wait() to sleep, so 
> I think the Kafka consumer should awake the heartbeat thread with a notify 
> when needed.
> We made two quick fixes to solve this issue:
>   - In NetworkClient.handle*(), we don't iterate on collections if they are 
> empty (to avoid unnecessary iterators instantiations).
>   - In KafkaConsumer.pollOnce(), if the poll timeout is equal to 0 we notify 
> the heartbeat thread to awake it (dirty fix because we don't handle the 
> autocommit case).
> With these 2 quick fixes and 128 consumers, the allocation rate drops down 
> from 55 MiB/s to 4 MiB/s.
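
The first quick fix described above amounts to guarding each handler with an emptiness check so the for-each loop never allocates an iterator on an idle poll. A minimal sketch, with stand-in class and method names rather than the real NetworkClient internals:

{code:java}
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for a NetworkClient-style handler.
class ReceiveHandlerSketch {
    private final List<String> responses = new ArrayList<>();

    void handleCompletedReceives(List<String> completedReceives) {
        // Bail out before the for-each loop: on an idle poll this avoids
        // allocating an ArrayList$Itr just to discover the list is empty.
        if (completedReceives.isEmpty())
            return;
        for (String receive : completedReceives) {
            responses.add(receive);
        }
    }
}
{code}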



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-5512) KafkaConsumer: High memory allocation rate when idle

2017-06-26 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5512:
---
Labels: performance  (was: )

> KafkaConsumer: High memory allocation rate when idle
> 
>
> Key: KAFKA-5512
> URL: https://issues.apache.org/jira/browse/KAFKA-5512
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.1.1
>Reporter: Stephane Roset
>  Labels: performance
> Fix For: 0.11.0.1
>
>
> Hi,
> We noticed in our application that the memory allocation rate increased 
> significantly when we have no Kafka messages to consume. We isolated the 
> issue by using a JVM that simply runs 128 Kafka consumers. These consumers 
> consume 128 partitions (so each consumer consumes one partition). The 
> partitions are empty and no message has been sent during the test. The 
> consumers were configured with default values (session.timeout.ms=30000, 
> fetch.max.wait.ms=500, receive.buffer.bytes=65536, 
> heartbeat.interval.ms=3000, max.poll.interval.ms=300000, 
> max.poll.records=500). The Kafka cluster was made of 3 brokers. Within this 
> context, the allocation rate was about 55 MiB/s. This high allocation rate 
> generates a lot of GC activity (to collect the young generation) and was an 
> issue for our project.
> We profiled the JVM with JProfiler. We noticed that there was a huge 
> quantity of ArrayList$Itr instances in memory. These iterators were mainly 
> instantiated by the methods handleCompletedReceives, handleCompletedSends, 
> handleConnections and handleDisconnections of the class NetworkClient. We also 
> noticed that we had a lot of calls to the pollOnce method of the class 
> KafkaConsumer.
> So we decided to run only one consumer and to profile the calls to the method 
> pollOnce. We noticed that regularly a huge number of calls is made to this 
> method, up to 268000 calls within 100ms. The pollOnce method calls the 
> NetworkClient.handle* methods. These methods iterate on collections (even if 
> they are empty), so that explains the huge number of iterators in memory.
> The large number of calls is related to the heartbeat mechanism. The pollOnce 
> method calculates the poll timeout; if a heartbeat needs to be done, the 
> timeout will be set to 0. The problem is that the heartbeat thread checks 
> every 100 ms (default value of retry.backoff.ms) if a heartbeat should be 
> sent, so the KafkaConsumer will call the poll method in a loop without 
> timeout until the heartbeat thread awakes. For example: the heartbeat thread 
> just started to wait and will awake in 99ms. So during 99ms, the 
> KafkaConsumer will call in a loop the pollOnce method and will use a timeout 
> of 0. That explains how we can have 268000 calls within 100ms. 
> The heartbeat thread calls the method AbstractCoordinator.wait() to sleep, so 
> I think the Kafka consumer should awake the heartbeat thread with a notify 
> when needed.
> We made two quick fixes to solve this issue:
>   - In NetworkClient.handle*(), we don't iterate on collections if they are 
> empty (to avoid unnecessary iterators instantiations).
>   - In KafkaConsumer.pollOnce(), if the poll timeout is equal to 0 we notify 
> the heartbeat thread to awake it (dirty fix because we don't handle the 
> autocommit case).
> With these 2 quick fixes and 128 consumers, the allocation rate drops down 
> from 55 MiB/s to 4 MiB/s.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-5512) KafkaConsumer: High memory allocation rate when idle

2017-06-26 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5512:
---
Fix Version/s: 0.11.0.1

> KafkaConsumer: High memory allocation rate when idle
> 
>
> Key: KAFKA-5512
> URL: https://issues.apache.org/jira/browse/KAFKA-5512
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.1.1
>Reporter: Stephane Roset
>  Labels: performance
> Fix For: 0.11.0.1
>
>
> Hi,
> We noticed in our application that the memory allocation rate increased 
> significantly when we have no Kafka messages to consume. We isolated the 
> issue by using a JVM that simply runs 128 Kafka consumers. These consumers 
> consume 128 partitions (so each consumer consumes one partition). The 
> partitions are empty and no message has been sent during the test. The 
> consumers were configured with default values (session.timeout.ms=30000, 
> fetch.max.wait.ms=500, receive.buffer.bytes=65536, 
> heartbeat.interval.ms=3000, max.poll.interval.ms=300000, 
> max.poll.records=500). The Kafka cluster was made of 3 brokers. Within this 
> context, the allocation rate was about 55 MiB/s. This high allocation rate 
> generates a lot of GC activity (to collect the young generation) and was an 
> issue for our project.
> We profiled the JVM with JProfiler. We noticed that there was a huge 
> quantity of ArrayList$Itr instances in memory. These iterators were mainly 
> instantiated by the methods handleCompletedReceives, handleCompletedSends, 
> handleConnections and handleDisconnections of the class NetworkClient. We also 
> noticed that we had a lot of calls to the pollOnce method of the class 
> KafkaConsumer.
> So we decided to run only one consumer and to profile the calls to the method 
> pollOnce. We noticed that regularly a huge number of calls is made to this 
> method, up to 268000 calls within 100ms. The pollOnce method calls the 
> NetworkClient.handle* methods. These methods iterate on collections (even if 
> they are empty), so that explains the huge number of iterators in memory.
> The large number of calls is related to the heartbeat mechanism. The pollOnce 
> method calculates the poll timeout; if a heartbeat needs to be done, the 
> timeout will be set to 0. The problem is that the heartbeat thread checks 
> every 100 ms (default value of retry.backoff.ms) if a heartbeat should be 
> sent, so the KafkaConsumer will call the poll method in a loop without 
> timeout until the heartbeat thread awakes. For example: the heartbeat thread 
> just started to wait and will awake in 99ms. So during 99ms, the 
> KafkaConsumer will call in a loop the pollOnce method and will use a timeout 
> of 0. That explains how we can have 268000 calls within 100ms. 
> The heartbeat thread calls the method AbstractCoordinator.wait() to sleep, so 
> I think the Kafka consumer should awake the heartbeat thread with a notify 
> when needed.
> We made two quick fixes to solve this issue:
>   - In NetworkClient.handle*(), we don't iterate on collections if they are 
> empty (to avoid unnecessary iterators instantiations).
>   - In KafkaConsumer.pollOnce(), if the poll timeout is equal to 0 we notify 
> the heartbeat thread to awake it (dirty fix because we don't handle the 
> autocommit case).
> With these 2 quick fixes and 128 consumers, the allocation rate drops down 
> from 55 MiB/s to 4 MiB/s.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5512) KafkaConsumer: High memory allocation rate when idle

2017-06-26 Thread Stephane Roset (JIRA)
Stephane Roset created KAFKA-5512:
-

 Summary: KafkaConsumer: High memory allocation rate when idle
 Key: KAFKA-5512
 URL: https://issues.apache.org/jira/browse/KAFKA-5512
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Affects Versions: 0.10.1.1
Reporter: Stephane Roset


Hi,

We noticed in our application that the memory allocation rate increased 
significantly when we have no Kafka messages to consume. We isolated the issue 
by using a JVM that simply runs 128 Kafka consumers. These consumers consume 
128 partitions (so each consumer consumes one partition). The partitions are 
empty and no message has been sent during the test. The consumers were 
configured with default values (session.timeout.ms=30000, 
fetch.max.wait.ms=500, receive.buffer.bytes=65536, heartbeat.interval.ms=3000, 
max.poll.interval.ms=300000, max.poll.records=500). The Kafka cluster was made 
of 3 brokers. Within this context, the allocation rate was about 55 MiB/s. This 
high allocation rate generates a lot of GC activity (to collect the young 
generation) and was an issue for our project.

We profiled the JVM with JProfiler. We noticed that there was a huge quantity 
of ArrayList$Itr instances in memory. These iterators were mainly instantiated 
by the methods handleCompletedReceives, handleCompletedSends, handleConnections 
and handleDisconnections of the class NetworkClient. We also noticed that we 
had a lot of calls to the pollOnce method of the class KafkaConsumer.

So we decided to run only one consumer and to profile the calls to the method 
pollOnce. We noticed that regularly a huge number of calls is made to this 
method, up to 268000 calls within 100ms. The pollOnce method calls the 
NetworkClient.handle* methods. These methods iterate on collections (even if 
they are empty), so that explains the huge number of iterators in memory.

The large number of calls is related to the heartbeat mechanism. The pollOnce 
method calculates the poll timeout; if a heartbeat needs to be done, the 
timeout will be set to 0. The problem is that the heartbeat thread checks every 
100 ms (default value of retry.backoff.ms) if a heartbeat should be sent, so 
the KafkaConsumer will call the poll method in a loop without timeout until the 
heartbeat thread awakes. For example: the heartbeat thread just started to wait 
and will awake in 99ms. So during 99ms, the KafkaConsumer will call in a loop 
the pollOnce method and will use a timeout of 0. That explains how we can have 
268000 calls within 100ms. 

The heartbeat thread calls the method AbstractCoordinator.wait() to sleep, so I 
think the Kafka consumer should awake the heartbeat thread with a notify when 
needed.

We made two quick fixes to solve this issue:
  - In NetworkClient.handle*(), we don't iterate on collections if they are 
empty (to avoid unnecessary iterators instantiations).
  - In KafkaConsumer.pollOnce(), if the poll timeout is equal to 0 we notify 
the heartbeat thread to awake it (dirty fix because we don't handle the 
autocommit case).

With these 2 quick fixes and 128 consumers, the allocation rate drops down from 
55 MiB/s to 4 MiB/s.
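
The second quick fix boils down to waking the heartbeat thread instead of spinning with a zero poll timeout. The sketch below is an editor's illustration of that idea only; the class, lock, and method names are invented, and the real consumer/coordinator code is more involved (notably around autocommit, as the reporter notes).

{code:java}
// Illustrative only: not the real KafkaConsumer/AbstractCoordinator code.
class HeartbeatCoordinatorSketch {
    private final Object lock = new Object();
    private volatile boolean running = true;

    // Heartbeat thread: wakes up at most every retryBackoffMs to check
    // whether a heartbeat must be sent.
    void heartbeatLoop(long retryBackoffMs) throws InterruptedException {
        synchronized (lock) {
            while (running) {
                // ... send a heartbeat here if one is due ...
                lock.wait(retryBackoffMs);
            }
        }
    }

    // Polling thread: when the computed poll timeout is 0 because a heartbeat
    // is due, notify the heartbeat thread instead of re-entering poll in a
    // tight loop until retry.backoff.ms elapses.
    void maybeWakeHeartbeat(long pollTimeoutMs) {
        if (pollTimeoutMs == 0) {
            synchronized (lock) {
                lock.notify();
            }
        }
    }

    void shutdown() {
        synchronized (lock) {
            running = false;
            lock.notify();
        }
    }
}
{code}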








--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (KAFKA-4388) Connect key and value converters are listed without default values

2017-06-26 Thread Evgeny Veretennikov (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Evgeny Veretennikov reassigned KAFKA-4388:
--

Assignee: Evgeny Veretennikov

> Connect key and value converters are listed without default values
> --
>
> Key: KAFKA-4388
> URL: https://issues.apache.org/jira/browse/KAFKA-4388
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.1.0
>Reporter: Ewen Cheslack-Postava
>Assignee: Evgeny Veretennikov
>Priority: Minor
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> KIP-75 added per-connector converters. This exposes the settings on a 
> per-connector basis via the validation API. However, the way this is 
> specified for each connector is via a config value with no default value. 
> This means the validation API implies there is no setting unless you provide 
> one.
> It would be much better to include the default value extracted from the 
> WorkerConfig instead so it's clear you shouldn't need to override the default.
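
One way to realise that suggestion, as a sketch only (the method name, the "key.converter" literal, and how the worker default is obtained are assumptions, not the actual Connect code), is to define the per-connector config with a default pulled from the worker configuration:

{code:java}
import org.apache.kafka.common.config.ConfigDef;

// Illustrative sketch: seed the per-connector converter override with the
// worker-level default so the validation API reports a meaningful value.
public class ConverterConfigSketch {
    static ConfigDef withWorkerDefault(ConfigDef base, Class<?> workerKeyConverter) {
        return base.define(
                "key.converter",              // per-connector override key (assumed)
                ConfigDef.Type.CLASS,
                workerKeyConverter,           // default extracted from the WorkerConfig
                ConfigDef.Importance.LOW,
                "Converter for record keys; falls back to the worker's setting.");
    }
}
{code}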



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-4750) KeyValueIterator returns null values

2017-06-26 Thread Evgeny Veretennikov (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16062575#comment-16062575
 ] 

Evgeny Veretennikov commented on KAFKA-4750:


If we invoke {{store.put(key, value)}} and the serde returns null for the value, 
shouldn't we throw an NPE instead of deleting the key from the store? It seems 
counter-intuitive that the value here can be null, since null indicates that no 
value was found:

{code:java}
Object value = ???; // some value serialized to null by serde
store.put(key, value);
Object result = store.get(key); // returns null, which looks like no value was found
{code}

Throwing an exception inside {{put()}} would prevent potential data loss.

By the way, why does the {{KeyValueStore.put()}} method allow the value to be null?
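
A hypothetical sketch of the check proposed above, not the actual store implementation: if the serde turns a non-null value into null bytes, fail fast instead of silently deleting.

{code:java}
import java.util.HashMap;
import java.util.Map;

// Illustrative only: store, serializer interface, and semantics are invented.
class SerializingStoreSketch<K, V> {
    interface ValueSerializer<V2> { byte[] serialize(V2 value); }

    private final Map<K, byte[]> inner = new HashMap<>();
    private final ValueSerializer<V> serializer;

    SerializingStoreSketch(ValueSerializer<V> serializer) {
        this.serializer = serializer;
    }

    void put(K key, V value) {
        byte[] bytes = serializer.serialize(value);
        if (value != null && bytes == null) {
            throw new NullPointerException("serde returned null for a non-null value");
        }
        if (bytes == null) {
            inner.remove(key); // an explicit null keeps delete semantics
        } else {
            inner.put(key, bytes);
        }
    }
}
{code}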

> KeyValueIterator returns null values
> 
>
> Key: KAFKA-4750
> URL: https://issues.apache.org/jira/browse/KAFKA-4750
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.1.1, 0.11.0.0, 0.10.2.1
>Reporter: Michal Borowiecki
>Assignee: Evgeny Veretennikov
>  Labels: newbie
> Attachments: DeleteTest.java
>
>
> The API for the ReadOnlyKeyValueStore.range method promises that the returned 
> iterator will not return null values. However, after upgrading from 0.10.0.0 to 
> 0.10.1.1 we found that null values are returned, causing NPEs on our side.
> I found this happens after removing entries from the store, and I found a 
> resemblance to the SAMZA-94 defect. The problem seems to be the same as it was 
> there: when deleting entries with a serializer that does not return null when 
> null is passed in, the state store doesn't actually delete that key/value pair, 
> but the iterator will return a null value for that key.
> When I modified our serializer to return null when null is passed in, the 
> problem went away. However, I believe this should be fixed in Kafka Streams, 
> perhaps with an approach similar to SAMZA-94.
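
For reference, a serializer with the behaviour the reporter describes (null in, null out) might look like the following; this is an editor's sketch against the public Serializer interface, not the reporter's actual code:

{code:java}
import java.nio.charset.StandardCharsets;
import java.util.Map;
import org.apache.kafka.common.serialization.Serializer;

// Returning null for a null input lets the state store treat put(key, null)
// as a delete, so the range iterator no longer surfaces null values.
public class NullSafeStringSerializer implements Serializer<String> {
    @Override
    public void configure(Map<String, ?> configs, boolean isKey) { }

    @Override
    public byte[] serialize(String topic, String data) {
        return data == null ? null : data.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public void close() { }
}
{code}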



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)