Thank you everyone.
You've given me quite a few good options to look into.
Lucas
On Mon, Jul 5, 2010 at 5:57 AM, Jean-Charles Campagne
wrote:
> Hello Lucas,
>
> We do not provide, yet, exactly what you are looking for, but for now
> we might help you on the language filtering part.
> We provid
Hello Lucas,
We do not provide, yet, exactly what you are looking for, but for now
we might help you on the language filtering part.
We provide an API for language and location filtering for
micro-messages (Tweets and Facebook messages, etc.).
You'll find more info on the API website: http://deve
You are right. Separate subpopulations are out of our reach.
Apart from following/friendship connections, we also look at mentions and
follow them.
If a newcomer or someone from another population mentions one of the people in
our network, his tweet will reach us and we can test him and add him as well
Interesting. Your method is similar to the breadth-first crawl that many people
do (for example, see the academic paper by Kwak et al. 2010).
You have to keep in mind, however, that you are only crawling the giant
component of the network, the connected part. If there are any Turkish users
who
We have implemented the Turkish version: Twitturk
http://twitturk.com/home/lang/en
We skipped the first three steps and instead started with a few Turkish users,
crawled the whole network, and tested for each new user whether the
description or latest tweets are in Turkish.
We have almost 100.000
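
For anyone who wants to try the same idea, here is a minimal sketch of such a
seed-based crawl. It is not the Twitturk code: the toy graph, the function names
and the language test are all made up for illustration, and a real crawler
would fetch friends/followers from the API and run an actual language check on
the description or latest tweets. It also assumes that rejected users are not
expanded further, which the message above does not state.

```python
from collections import deque

def crawl_language_network(seeds, get_neighbors, is_target_language):
    """Breadth-first crawl that keeps and expands only users passing the test.

    get_neighbors and is_target_language are hypothetical callbacks; in a
    real crawler they would wrap API calls and a language detector.
    """
    accepted = set()
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        user = queue.popleft()
        if not is_target_language(user):
            continue  # tested and rejected, so its connections are not followed
        accepted.add(user)
        for neighbor in get_neighbors(user):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return accepted

# Toy data (illustrative only): "elif" is reachable only through
# users who fail the language test.
GRAPH = {
    "ayse": ["mehmet", "john"],
    "mehmet": ["ayse", "zeynep"],
    "john": ["mary"],
    "zeynep": [],
    "mary": ["elif"],
    "elif": [],
}
TURKISH = {"ayse", "mehmet", "zeynep", "elif"}

found = crawl_language_network(
    ["ayse"],
    lambda user: GRAPH.get(user, []),
    lambda user: user in TURKISH,
)
```

In this toy run "elif" is never found, because the only path to her goes through
users who fail the test. That is the same limitation raised elsewhere in this
thread about users outside the connected part of the network.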
John,
yes, thanks a lot for the design proposal - that is what inspired my own
system. I am not primarily filtering by language, however, but by country, so
I'm using time zone and location data together with a list of cities from
http://www.geonames.org/
The manual cross-check in my thesis sh
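
A rough sketch of how a country filter along these lines might look, in case it
helps others. This is not the thesis code: the function name, the sample city
names and the sample time zone labels are placeholders, and a real city list
would be loaded from a GeoNames export rather than hard-coded.

```python
import re

def matches_country(location, time_zone, cities, time_zones):
    """Return True if the profile location names a known city
    or the profile time zone is one of the expected labels.

    cities must contain lowercase single-word names; multi-word city
    names would need phrase matching, which this sketch leaves out.
    """
    tokens = set(re.findall(r"\w+", (location or "").lower()))
    city_hit = bool(tokens & cities)
    zone_hit = (time_zone or "") in time_zones
    return city_hit or zone_hit

# Illustrative values only, not taken from the original system.
SAMPLE_CITIES = {"berlin", "hamburg"}
SAMPLE_ZONES = {"Berlin"}

hit_by_city = matches_country("Berlin, Germany", None, SAMPLE_CITIES, SAMPLE_ZONES)
hit_by_zone = matches_country("", "Berlin", SAMPLE_CITIES, SAMPLE_ZONES)
miss = matches_country("NYC", "Eastern Time (US & Canada)", SAMPLE_CITIES, SAMPLE_ZONES)
```

Accepting either signal favours recall; requiring both would be stricter and
would drop users who left one of the two profile fields empty.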
It's great to hear that someone implemented all this. There's a similar
technique documented here:
http://dev.twitter.com/pages/streaming_api_concepts, under By Language and
Country. My suggestion was to start with a list of stop words to build your
user corpus -- but I don't know how well Farsi wo
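
To make the stop-word idea concrete, here is a small sketch that scores a tweet
by the share of its tokens found in a stop-word list. The word list is a short
hand-picked sample of common Persian function words and the 0.2 threshold is an
arbitrary assumption; neither comes from the Twitter documentation. Some of
these words also occur in Arabic text, so a real filter would need a larger
list and probably a second check.

```python
# Small illustrative sample of common Persian function words (an assumption,
# not an official list).
FARSI_STOP_WORDS = {"و", "در", "به", "از", "که", "این", "را", "با", "است"}

PUNCTUATION = ".,!?؟،:;\"'()"

def stop_word_ratio(text, stop_words):
    """Fraction of whitespace-separated tokens that are stop words."""
    tokens = [token.strip(PUNCTUATION) for token in text.split()]
    tokens = [token for token in tokens if token]
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in stop_words)
    return hits / len(tokens)

def looks_like_farsi(text, threshold=0.2):
    """Heuristic check; the threshold is a guess and should be tuned on real data."""
    return stop_word_ratio(text, FARSI_STOP_WORDS) >= threshold

farsi_ratio = stop_word_ratio("این کتاب از من است", FARSI_STOP_WORDS)
english_ratio = stop_word_ratio("this book is mine", FARSI_STOP_WORDS)
```

Very short tweets will give noisy ratios, so it may work better to score a
user's latest tweets together rather than one message at a time.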
Hi Lucas,
as someone who approached a similar problem, my recommendation would be to
track users. In order to get results quickly (rather than every few hours via
user timeline calls), you need streaming access, which is a bit more
complicated. I implemented such a system in order to track the
Hello,
I am trying to create an app that will show tweets and trends in
Farsi, for native speakers. I would like to somehow get a sample
'garden hose' of Farsi-based tweets, but I am unable to come up with
an elegant solution.
I see the following options:
- Sample all tweets, and run a language