[
https://issues.apache.org/jira/browse/SPARK-28415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dongjoon Hyun resolved SPARK-28415.
-----------------------------------
Resolution: Invalid
Per the decision on https://github.com/apache/spark/pull/27022, I am
closing this JIRA issue as `Invalid`. Please feel free to reopen it if there is
an update.
> Add messageHandler to Kafka 10 direct stream API
> ------------------------------------------------
>
> Key: SPARK-28415
> URL: https://issues.apache.org/jira/browse/SPARK-28415
> Project: Spark
> Issue Type: New Feature
> Components: DStreams
> Affects Versions: 3.0.0
> Reporter: Michael Spector
> Priority: Major
>
> The lack of a messageHandler parameter in KafkaUtils.createDirectStream(...) in
> the new Kafka API is what prevents us from upgrading our processes to use it,
> and here's why:
> # messageHandler() allowed parsing / filtering / projecting huge JSON files
> at an early stage (only a small subset of JSON fields is required for a
> process). Without it, our current cluster configuration cannot keep up with
> the traffic.
> # Transforming Kafka events right after a stream is created prevents us from
> using the HasOffsetRanges interface later. This means that the whole message
> must be propagated to the end of the pipeline, which is very inefficient.
>