Clemens Wolff created BAHIR-117:
-----------------------------------

             Summary: Expand filtering options for TwitterInputDStream
                 Key: BAHIR-117
                 URL: https://issues.apache.org/jira/browse/BAHIR-117
             Project: Bahir
          Issue Type: Improvement
          Components: Spark Streaming Connectors
            Reporter: Clemens Wolff
            Priority: Minor


Currently, TwitterInputDStream supports filtering only by keywords [1], which
corresponds to the "track" option in the Twitter API [2]. The Twitter API
supports many more ways to receive a filtered stream (e.g. getting Tweets from
a particular location [3]). It would be very useful to expose these additional
filtering options in this library.

Proposal: add a new public method to TwitterUtils that follows the same
interface as createStream [4] but takes a FilterQuery [5] object as an
argument. This gives our users full filtering flexibility.
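
As a rough sketch of what this could look like: the method name
createFilteredStream and the trait ProposedTwitterUtils below are hypothetical
and only mirror the shape of the existing createStream. The FilterQuery calls
assume the varargs signatures of twitter4j 4.x, and the keywords and bounding
box are illustrative values.

```scala
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.streaming.dstream.ReceiverInputDStream
import twitter4j.{FilterQuery, Status}
import twitter4j.auth.Authorization

// Hypothetical shape of the proposed method. The name and the default
// storage level are assumptions modelled on the existing createStream.
trait ProposedTwitterUtils {
  def createFilteredStream(
      ssc: StreamingContext,
      twitterAuth: Option[Authorization],
      query: FilterQuery,
      storageLevel: StorageLevel = StorageLevel.MEMORY_AND_DISK_SER_2
  ): ReceiverInputDStream[Status]
}

object FilterQueryExamples {
  // Keyword filter: the "track" option, which is all that is exposed today.
  val byKeyword: FilterQuery = new FilterQuery().track("spark", "bahir")

  // Location filter: a bounding box given as (longitude, latitude) pairs,
  // south-west corner first, then north-east corner.
  val byLocation: FilterQuery = new FilterQuery().locations(
    Array(-122.75, 36.8),
    Array(-121.75, 37.8)
  )
}
```

Taking the FilterQuery object itself, rather than one parameter per filter
type, would mean that any option twitter4j supports becomes available without
further changes to TwitterUtils.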

I'm currently working on Project Fortis, a social data analysis platform for
the United Nations [6]. The extra filtering options would be very useful for my
project, so I'm happy to implement this and create a pull request.

[1] https://github.com/apache/bahir/blob/fd4c35fc9f7ebb57464d231cf5d66e7bc4096a1b/streaming-twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala#L44
[2] https://dev.twitter.com/streaming/overview/request-parameters#track
[3] https://dev.twitter.com/streaming/overview/request-parameters#locations
[4] https://github.com/apache/bahir/blob/fd4c35fc9f7ebb57464d231cf5d66e7bc4096a1b/streaming-twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterUtils.scala#L39
[5] http://twitter4j.org/javadoc/twitter4j/FilterQuery.html
[6] https://fortis-web.azurewebsites.net/#/site/ocha/



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
