[twitter-dev] No of statuses extracted by statuses/filter

2010-10-11 Thread AA
Hi everybody!
I'm designing an app to do some mining over a corpus of tweets.
I think I'll use the Streaming API, statuses/filter, filtering by keywords.

I'd like to know, before starting development, what percentage of
tweets this stream delivers relative to the total (by 'total' I mean
all tweets that contain the tracked keywords).
This information is crucial for statistical confidence: a
very small sample may not be significant.

Additionally, I've been googling and reading a lot for 3 days and I
can't figure out how I can use the different access levels.
I've read http://dev.twitter.com/pages/streaming_api_methods#statuses-filter
but how can I use these different levels of access?

Thanks in advance!
Regards
Alejandro.

-- 
Twitter developer documentation and resources: http://dev.twitter.com/doc
API updates via Twitter: http://twitter.com/twitterapi
Issues/Enhancements Tracker: http://code.google.com/p/twitter-api/issues/list
Change your membership to this group: 
http://groups.google.com/group/twitter-development-talk


Re: [twitter-dev] No of statuses extracted by statuses/filter

2010-10-11 Thread M. Edward (Ed) Borasky

Quoting AA alejandro.ale...@gmail.com:


> Hi everybody!
> I'm designing an app to do some mining over a corpus of tweets.
> I think I'll use the Streaming API, statuses/filter, filtering by keywords.
>
> I'd like to know, before starting development, what percentage of
> tweets this stream delivers relative to the total (by 'total' I mean
> all tweets that contain the tracked keywords).
> This information is crucial for statistical confidence: a
> very small sample may not be significant.
>
> Additionally, I've been googling and reading a lot for 3 days and I
> can't figure out how I can use the different access levels.
> I've read http://dev.twitter.com/pages/streaming_api_methods#statuses-filter
> but how can I use these different levels of access?
>
> Thanks in advance!
> Regards
> Alejandro.


I actually think the answer to *your* question is: if your filter
criteria are sufficiently narrow, you get *all* of the public tweets
with those keywords sent by users who aren't being blocked by
Twitter's quality filter. At least that's what the documentation has
said in the past.
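
If the filter is not narrow enough, one rough way to see how much is
being missed is to count what arrives and compare it with the
stream's own limit notices. The sketch below is only an illustration
and rests on assumptions that should be checked against the current
Streaming API documentation: that the stream sends one JSON message
per line, that statuses carry a "text" field, and that a limit notice
looks like {"limit": {"track": N}} with N being a running count of
matching statuses that were not delivered. The function name
coverage_from_stream and the sample lines are made up for this
example; no network access is involved.

```python
import json


def coverage_from_stream(lines):
    """Return (delivered, missed, coverage) for an iterable of raw stream lines.

    Assumption: limit notices have the form {"limit": {"track": N}},
    where N is a cumulative count of undelivered matching statuses.
    """
    delivered = 0
    missed = 0
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # blank keep-alive line
        msg = json.loads(raw)
        if "limit" in msg:
            # Assumed cumulative, so keep the largest value seen so far.
            missed = max(missed, msg["limit"].get("track", 0))
        elif "text" in msg:
            delivered += 1
        # Anything else (for example delete notices) is ignored.
    total = delivered + missed
    coverage = delivered / total if total else None
    return delivered, missed, coverage


# Hypothetical sample of stream lines, not real Twitter data.
sample = [
    '{"id": 1, "text": "python rocks"}',
    '',
    '{"limit": {"track": 3}}',
    '{"id": 2, "text": "more python"}',
    '{"delete": {"status": {"id": 1, "user_id": 7}}}',
    '{"limit": {"track": 6}}',
]

print(coverage_from_stream(sample))
```

With the sample above, 2 statuses arrive and the last notice reports
6 undelivered, so the estimated coverage is 2 / (2 + 6) = 0.25. This
only measures coverage relative to what reaches the filter stream; it
says nothing about tweets that never get there.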


But *my* question is: how does one determine the total number of
tweets, for some definition of total?


a. All tweets created, including those that aren't public?
b. All public tweets created, including those from low-quality users
that don't get indexed by search or sent to the filter stream?
c. All tweets sent to the inlet of the filter stream and the various
elevated access level streams?


Remind me again - when does Snowflake go live? I haven't looked at  
Streaming data for a couple months.


--
M. Edward (Ed) Borasky
http://borasky-research.net http://twitter.com/znmeb

A mathematician is a device for turning coffee into theorems. - Paul Erdos

