[
https://issues.apache.org/jira/browse/SPARK-13065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15125834#comment-15125834
]
sachin aggarwal commented on SPARK-13065:
-----------------------------------------
I would like to work on this , will issue a pull request soon..
> streaming-twitter pass twitter4j.FilterQuery argument to
> TwitterUtils.createStream()
> ------------------------------------------------------------------------------------
>
> Key: SPARK-13065
> URL: https://issues.apache.org/jira/browse/SPARK-13065
> Project: Spark
> Issue Type: Improvement
> Components: Streaming
> Affects Versions: 1.6.0
> Environment: all
> Reporter: Andrew Davidson
> Priority: Minor
> Labels: twitter
> Original Estimate: 2h
> Remaining Estimate: 2h
>
> The twitter stream api is very powerful provides a lot of support for
> twitter.com side filtering of status objects. When ever possible we want to
> let twitter do as much work as possible for us.
> currently the spark twitter api only allows you to configure a small sub set
> of possible filters
> String{} filters = {"tag1", tag2"}
> JavaDStream<Status> tweets =TwitterUtils.createStream(ssc, twitterAuth,
> filters);
> The current implemenation does
> private[streaming]
> class TwitterReceiver(
> twitterAuth: Authorization,
> filters: Seq[String],
> storageLevel: StorageLevel
> ) extends Receiver[Status](storageLevel) with Logging {
> . . .
> val query = new FilterQuery
> if (filters.size > 0) {
> query.track(filters.mkString(","))
> newTwitterStream.filter(query)
> } else {
> newTwitterStream.sample()
> }
> ...
> rather than construct the FilterQuery object in TwitterReceiver.onStart(). we
> should be able to pass a FilterQueryObject
> looks like an easy fix. See source code links bellow
> kind regards
> Andy
> https://github.com/apache/spark/blob/master/external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala#L60
> https://github.com/apache/spark/blob/master/external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala#L89
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]