ASF GitHub Bot commented on BAHIR-117:

GitHub user c-w opened a pull request:


    [BAHIR-117] Expand filtering options for TwitterInputDStream

    This pull request adds a new method to TwitterUtils that lets users pass
    an arbitrary FilterQuery down to the TwitterReceiver. This enables use
    cases such as receiving Tweets by location or by user handle. Previously,
    users could only receive Tweets matching disjunctive keyword queries.
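
As a rough illustration of the kind of query this enables, the sketch below
builds a twitter4j FilterQuery that combines keyword and location filters. It
assumes twitter4j 4.x, where track and locations accept varargs; the keywords
and the bounding box are made-up values. This message does not state the name
or signature of the new TwitterUtils method, so the call that would consume
the query is left as a comment.

```scala
import twitter4j.FilterQuery

object FilterQueryExample {
  // Hypothetical values, chosen only for illustration.
  // Each location is a (longitude, latitude) pair; two pairs form one
  // bounding box: south-west corner first, then north-east corner.
  // The Twitter streaming API ORs "track" and "locations" together, so this
  // query matches Tweets that contain a keyword OR fall inside the box.
  val query: FilterQuery = new FilterQuery()
    .track("spark", "bahir")
    .locations(Array(-122.75, 36.8), Array(-121.75, 37.8))

  // The query would then be handed to the new TwitterUtils method added by
  // this pull request (name not given in this message), presumably together
  // with a StreamingContext and optional Authorization, as for createStream.
}
```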

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/c-w/bahir bahir-117

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #43

commit 25be6a38f6ddd7e59ce198acf3b1f468111aafe1
Author: Clemens Wolff <clewo...@microsoft.com>
Date:   2017-05-04T19:52:54Z

    Fix checkstyle violation

commit 4c5124501fe0e3d799c7d37d67e07b175a353b72
Author: Clemens Wolff <clewo...@microsoft.com>
Date:   2017-05-04T20:09:26Z

    Add stream creation method with arbitrary query


> Expand filtering options for TwitterInputDStream
> ------------------------------------------------
>                 Key: BAHIR-117
>                 URL: https://issues.apache.org/jira/browse/BAHIR-117
>             Project: Bahir
>          Issue Type: Improvement
>          Components: Spark Streaming Connectors
>            Reporter: Clemens Wolff
>
> Currently, the TwitterInputDStream only supports filtering by keywords [1],
> which corresponds to the "track" option in the Twitter API [2]. The Twitter
> API supports many more ways to receive a filtered stream (e.g. get Tweets in
> a particular location [3]). It would be very useful to expose these
> additional filtering options in this library.
>
> Proposal: add a new public method to TwitterUtils which follows the same
> interface as createStream [4] but which takes a FilterQuery [5] object as
> an argument. In this way, we give full filtering flexibility to our users.
>
> I'm currently working on Project Fortis, a social data analysis platform for
> the United Nations [6]. The extra filtering options would be very useful for
> my project, so I'm happy to implement this and create a pull request.
>
> [1] https://github.com/apache/bahir/blob/fd4c35fc9f7ebb57464d231cf5d66e7bc4096a1b/streaming-twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala#L44
> [2] https://dev.twitter.com/streaming/overview/request-parameters#track
> [3] https://dev.twitter.com/streaming/overview/request-parameters#locations
> [4] https://github.com/apache/bahir/blob/fd4c35fc9f7ebb57464d231cf5d66e7bc4096a1b/streaming-twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterUtils.scala#L39
> [5] http://twitter4j.org/javadoc/twitter4j/FilterQuery.html
> [6] https://fortis-web.azurewebsites.net/#/site/ocha/

This message was sent by Atlassian JIRA
