[ 
https://issues.apache.org/jira/browse/SPARK-4960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265533#comment-14265533
 ] 

Tathagata Das commented on SPARK-4960:
--------------------------------------

The reason I am suggesting the limited solution is that there are use cases 
where even T => Iterator[T] helps. People have asked me before whether the 
data received from a source can be filtered even before it has been inserted, 
to reduce memory usage. Others have also asked if they can do very low-latency 
work, like pushing received data out to some other store immediately for 
greater reliability. This simplified interceptor pattern can solve those. 
However, yes, it does not solve [[email protected]]'s requirement. That is 
best solved using a new Receiver and InputDStream.

This limited solution can be implemented without another receiver. The 
interceptor function, if set, can be applied by the BlockGenerator to every 
record it receives. And since we want everyone to use the BlockGenerator, 
all receivers will be able to take advantage of this interceptor. 
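
A minimal sketch of the idea, under stated assumptions: SimpleBlockGenerator, 
addData, currentBlock, dropEmpty and forwardTo are hypothetical names used only 
for illustration and are not Spark's actual BlockGenerator API. It shows an 
optional T => Iterator[T] function being applied to each record before the 
record is buffered, covering both the filtering case and the forward-to-
another-store case mentioned above.

```scala
import scala.collection.mutable.ArrayBuffer

// Illustration only: a stand-in for a block generator, not Spark's own class.
object InterceptorSketch {
  // An interceptor maps one received record to zero or more records to store.
  type Interceptor[T] = T => Iterator[T]

  class SimpleBlockGenerator[T](interceptor: Option[Interceptor[T]] = None) {
    private val buffer = ArrayBuffer.empty[T]

    // Apply the interceptor, if one is set, to every incoming record.
    def addData(record: T): Unit = interceptor match {
      case Some(f) => buffer ++= f(record)
      case None    => buffer += record
    }

    // Snapshot of the records buffered so far.
    def currentBlock: Seq[T] = buffer.toList
  }

  // Filtering case: drop blank lines before they occupy any buffer memory.
  val dropEmpty: Interceptor[String] =
    line => if (line.trim.isEmpty) Iterator.empty else Iterator.single(line)

  // Low-latency case: push each record to another sink, then keep it unchanged.
  def forwardTo[T](sink: T => Unit): Interceptor[T] =
    record => { sink(record); Iterator.single(record) }

  def run(): Seq[String] = {
    val gen = new SimpleBlockGenerator[String](Some(dropEmpty))
    Seq("a", "", "  ", "b").foreach(gen.addData)
    gen.currentBlock
  }
}
```

Because the function returns an Iterator, the same hook can drop a record 
(empty iterator), pass it through (single element), or expand it into several 
records, without needing a separate receiver.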


> Interceptor pattern in receivers
> --------------------------------
>
>                 Key: SPARK-4960
>                 URL: https://issues.apache.org/jira/browse/SPARK-4960
>             Project: Spark
>          Issue Type: New Feature
>          Components: Streaming
>            Reporter: Tathagata Das
>
> Sometimes it is good to intercept a message received through a receiver and 
> modify / do something with the message before it is stored into Spark. This 
> is often referred to as the interceptor pattern. There should be a general way 
> to specify an interceptor function that gets applied to all receivers. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
