[jira] [Updated] (STORM-3102) Storm Kafka Client performance issues with Kafka Client v1.0.0

2018-06-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/STORM-3102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated STORM-3102:
--
Labels: pull-request-available  (was: )

> Storm Kafka Client performance issues with Kafka Client v1.0.0
> --
>
> Key: STORM-3102
> URL: https://issues.apache.org/jira/browse/STORM-3102
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-kafka-client
> Affects Versions: 2.0.0, 1.0.6, 1.1.3, 1.2.2
> Reporter: Andy Seidel
> Assignee: Andy Seidel
> Priority: Major
> Labels: pull-request-available
>
> Recently I upgraded our storm topology to use the storm-kafka-client instead 
> of storm-kafka.  After the upgrade in our production environment we saw a 
> significant (2x) reduction in our processing throughput.
> We process ~2 kafka messages per second, on a 10 machine kafka 1.0.0 
> server cluster.
> After some investigation, it looks like the issue only occurs when using 
> kafka clients 0.11 or newer.
> In Kafka 0.11, the KafkaConsumer method committed() always blocks to make a 
> remote call to get the last committed offsets:
> [https://github.com/apache/kafka/blob/e18335dd953107a61d89451932de33d33c0fd207/clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java#L1326-L1351]
> In Kafka 0.10.2, the Kafka consumer only made the blocking remote call if the 
> partition was not assigned to the consumer:
> [https://github.com/apache/kafka/blob/695596977c7f293513f255e07f5a4b0240a7595c/clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java#L1274-L1311]
>  
> As a result, every tuple requires a blocking remote call before it is 
> emitted:
> [https://github.com/apache/storm/blob/2dc3d53a11aa3fea62190d1e44fa8b621466/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpout.java#L464-L473]
> Removing this check returns performance to expected levels.
> Looking through the storm-kafka-client code, it is not clear to me what the 
> impact of removing the check would be.  In our case we want at-least-once 
> processing, but for other processing guarantees the call to 
> kafkaConsumer.committed(tp) is not needed, as the value is only looked at if 
> the processing mode is at-least-once.
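
To make the last point concrete, here is a minimal sketch of one way the
per-tuple lookup could be avoided. It is not the storm-kafka-client code and
not the fix from the pull request. Every name in it (CommittedOffsetCheck,
Guarantee, blockingLookup, recordCommit) is hypothetical, the blocking
KafkaConsumer.committed(tp) call is replaced by a plain function so the
example runs without Kafka, and the local cache is an added assumption rather
than something described in the issue.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToLongFunction;

// Illustrative sketch only; none of these names come from storm-kafka-client.
final class CommittedOffsetCheck {

    // Hypothetical stand-in for the spout's processing guarantee setting.
    enum Guarantee { AT_LEAST_ONCE, AT_MOST_ONCE, NO_GUARANTEE }

    private final Guarantee guarantee;
    // Stands in for the blocking KafkaConsumer.committed(tp) remote call.
    private final ToLongFunction<String> blockingLookup;
    // Assumption: last known committed offset per partition, kept locally.
    private final Map<String, Long> lastCommitted = new HashMap<>();

    CommittedOffsetCheck(Guarantee guarantee, ToLongFunction<String> blockingLookup) {
        this.guarantee = guarantee;
        this.blockingLookup = blockingLookup;
    }

    // Returns true if the record at 'offset' lies below the committed offset.
    boolean isAlreadyCommitted(String partition, long offset) {
        if (guarantee != Guarantee.AT_LEAST_ONCE) {
            // The committed offset would be ignored, so skip the remote call.
            return false;
        }
        Long committed = lastCommitted.get(partition);
        if (committed == null) {
            // At most one blocking lookup per partition instead of one per tuple.
            committed = blockingLookup.applyAsLong(partition);
            lastCommitted.put(partition, committed);
        }
        return offset < committed;
    }

    // Call after this consumer commits, so the local view stays current.
    void recordCommit(String partition, long nextOffset) {
        lastCommitted.put(partition, nextOffset);
    }
}
```

The cache is only valid while this consumer is the sole committer for the
partition. A real implementation would presumably also have to drop cached
entries when partitions are reassigned.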



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (STORM-3102) Storm Kafka Client performance issues with Kafka Client v1.0.0

2018-06-13 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/STORM-3102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stig Rohde Døssing updated STORM-3102:
--
Affects Version/s: 2.0.0

> Storm Kafka Client performance issues with Kafka Client v1.0.0
> --
>
> Key: STORM-3102
> URL: https://issues.apache.org/jira/browse/STORM-3102
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-kafka-client
> Affects Versions: 2.0.0, 1.0.6, 1.1.3, 1.2.2
> Reporter: Andy Seidel
> Priority: Major


[jira] [Updated] (STORM-3102) Storm Kafka Client performance issues with Kafka Client v1.0.0

2018-06-13 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/STORM-3102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stig Rohde Døssing updated STORM-3102:
--
Affects Version/s: 1.2.2
