[
https://issues.apache.org/jira/browse/SPARK-4960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265533#comment-14265533
]
Tathagata Das commented on SPARK-4960:
--------------------------------------
The reason I am suggesting the limited solution is that there are
use cases where even T => Iterator[T] helps. People have asked me before whether
the data received from a source can be filtered even before it has been
inserted, to reduce memory usage. Others have also asked if they can do very
low-latency work like pushing received data out to some other store immediately
for greater reliability. This simplified interceptor pattern can solve those.
However, yes, it does not solve [[email protected]] requirement. That should
best be solved using a new Receiver and InputDStream.
This limited solution can be implemented without another receiver. The
interceptor function, if set, can be applied by the BlockGenerator to every
record it receives. And since we want everyone to use the BlockGenerator,
all receivers will be able to take advantage of this interceptor.
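A minimal sketch of the idea, in plain Scala with no Spark dependency. The
names SketchBlockGenerator, addData, drain, dropEmpty and forwardTo are all
hypothetical and only illustrate the T => Iterator[T] shape; this is not the
real BlockGenerator API.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a block generator; NOT Spark's actual class or API.
final class SketchBlockGenerator[T](interceptor: Option[T => Iterator[T]] = None) {
  private val buffer = ArrayBuffer.empty[T]

  // Apply the interceptor (if set) to each record before it is buffered.
  // The interceptor may return zero, one, or many records per input record.
  def addData(record: T): Unit = interceptor match {
    case Some(f) => buffer ++= f(record)
    case None    => buffer += record
  }

  // Hand back the records buffered so far and reset the buffer.
  def drain(): Seq[T] = {
    val out = buffer.toList
    buffer.clear()
    out
  }
}

object InterceptorSketch {
  // Filtering use case: drop blank lines before they are ever stored.
  val dropEmpty: String => Iterator[String] =
    line => if (line.trim.isEmpty) Iterator.empty else Iterator(line)

  // Low-latency use case: push each record to an external sink (assumed to be
  // supplied by the caller), then pass the record through unchanged.
  def forwardTo(sink: String => Unit): String => Iterator[String] =
    line => { sink(line); Iterator(line) }

  def run(): Seq[String] = {
    val gen = new SketchBlockGenerator[String](Some(dropEmpty))
    Seq("a", "", "  ", "b").foreach(gen.addData)
    gen.drain()
  }
}
```

Because the interceptor sits in the generator rather than in any particular
receiver, a receiver that calls addData picks it up without changes of its own.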
> Interceptor pattern in receivers
> --------------------------------
>
> Key: SPARK-4960
> URL: https://issues.apache.org/jira/browse/SPARK-4960
> Project: Spark
> Issue Type: New Feature
> Components: Streaming
> Reporter: Tathagata Das
>
> Sometimes it is good to intercept a message received through a receiver and
> modify / do something with the message before it is stored into Spark. This
> is often referred to as the interceptor pattern. There should be a general way
> to specify an interceptor function that gets applied to all receivers.