Hello, I am trying to create an app that will show tweets and trends in Farsi, for native speakers. I would like to get a sample 'garden hose' of Farsi-based tweets, but I have not been able to come up with an elegant solution.
I see the following options:

- Sample all tweets, and run a language detection algorithm on each tweet to determine which are (or could be) Farsi.
  * Problem: only a very small percentage of the tweets will be in Farsi.
- Use the location filter to sample tweets from countries where Farsi is spoken, and then run a language detection algorithm on those tweets.
  * Problem: I seem to be limited in the size of the coordinate box I can provide. I cannot even cover all of Iran, for example.
- Filter on a standard Farsi term.
  * Problem: this limits my results to tweets containing that term.
- Search for language = Farsi.
  * Problem: this is not a stream, so I will need to keep searching.

Of these options, I think the one that makes the most sense is to search for tweets where language=farsi, and use since_id to keep my results new. Given this method, I have three questions:

1 - Is since_id the highest tweet_id from the previous result set?
2 - How often can I search (within the API limits, of course) to make sure I get new data?
3 - Will the language filter give me users whose default language is Farsi, or will it actually find tweets written in Farsi? I am aware that users can select their native language in their profile, but I also know this is not 100% reliable.

Can anyone think of a more elegant solution? Are there any hidden/experimental language filters available to us?

Thanks!
Lucas
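
To make the search-and-poll idea concrete, here is a rough Python sketch of what such a loop might look like. It is only an illustration under some assumptions: `fetch_page` is a hypothetical callable standing in for whatever search request is used (it is not a real Twitter API function), each tweet is assumed to be a dict with `id` and `text` keys, and `since_id` is assumed to be the highest id seen so far, which is exactly what question 1 asks about. The `looks_farsi` check is a crude script-based heuristic rather than real language detection: it looks for letters that Persian uses and standard Arabic does not, so it will also match Urdu and other languages that share those letters.

```python
# Letters common in Persian but absent from standard Arabic:
# peh, tcheh, jeh, gaf, plus the Persian forms of kaf and yeh.
PERSIAN_HINT_LETTERS = set("\u067E\u0686\u0698\u06AF\u06A9\u06CC")


def is_arabic_script(ch):
    """True if the character falls in the basic Arabic Unicode block."""
    return "\u0600" <= ch <= "\u06FF"


def looks_farsi(text, min_arabic_ratio=0.5):
    """Crude heuristic: mostly Arabic-script letters, at least one Persian hint letter."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    arabic = [ch for ch in letters if is_arabic_script(ch)]
    if len(arabic) / len(letters) < min_arabic_ratio:
        return False
    return any(ch in PERSIAN_HINT_LETTERS for ch in arabic)


def poll_once(fetch_page, since_id):
    """Run one search pass and return (farsi_tweets, new_since_id).

    fetch_page is a hypothetical callable: it takes since_id (or None on the
    first pass) and returns a list of dicts with 'id' and 'text' keys.
    """
    tweets = fetch_page(since_id)
    ids = [t["id"] for t in tweets]
    # Assumption: the next since_id is the highest id seen so far.
    new_since_id = max(ids) if ids else since_id
    farsi = [t for t in tweets if looks_farsi(t["text"])]
    return farsi, new_since_id
```

In use, `poll_once` would be called in a loop with a sleep between passes, feeding the returned `new_since_id` back in each time. How long to sleep depends on the API limits, which is question 2 and is not answered by this sketch.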