[ https://issues.apache.org/jira/browse/SPARK-12203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15046679#comment-15046679 ]
Sean Owen commented on SPARK-12203:
-----------------------------------
Is this really worth a third implementation? It seems to have the downsides of
both current implementations for not much gain.
> Add KafkaDirectInputDStream that directly pulls messages from Kafka Brokers
> using receivers
> -------------------------------------------------------------------------------------------
>
> Key: SPARK-12203
> URL: https://issues.apache.org/jira/browse/SPARK-12203
> Project: Spark
> Issue Type: New Feature
> Components: Streaming
> Reporter: Liang-Chi Hsieh
>
> Currently, we have DirectKafkaInputDStream, which directly pulls messages
> from Kafka Brokers without any receivers, and KafkaInputDStream, which pulls
> messages from a Kafka Broker using a receiver with ZooKeeper.
> As we observed, because DirectKafkaInputDStream retrieves messages from Kafka
> after each batch finishes, it incurs extra latency compared with
> KafkaInputDStream, which continues to pull messages during each batch window.
> So we propose adding KafkaDirectInputDStream, which pulls messages directly
> from Kafka Brokers as DirectKafkaInputDStream does, but uses receivers as
> KafkaInputDStream does and pulls messages during each batch window.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)