Hi everybody!

Thank you, Edward. I'm copying part of your answer:

["If your filter criteria are sufficiently narrow, you get *all* of the public tweets with those keywords sent by users who aren't being blocked by Twitter's quality filter." At least that's what the documentation has said in the past.]

- Can anyone confirm this?
- I think that, even with Edward's approach, I still have the same problem: even with "very narrow" criteria I can never know what the total is, so I can't know whether the tweets I get by streaming are useful or not.

I should point out that I don't need the exact total number of tweets at a given moment. What I'd like to know is an approximate percentage of some estimated total number of tweets. I dare to think this is part of the "service providing specification".

I do understand that it can be difficult to define "total tweets" exactly when streaming, since tweets arrive at Twitter continuously but not at a constant rate, but some estimate would be great.

Thank you all in advance.
Alejandro.

On Oct 11, 5:57 pm, "M. Edward (Ed) Borasky" <zn...@borasky-research.net> wrote:
> Quoting AA <[email protected]>:
>
> > Hi everybody!
> > I'm designing an app to do some mining over a corpus of tweets.
> > I think I'll use the streaming API, statuses/filter, filtering by keywords.
> >
> > I'd like to know, before starting development, what percentage
> > of the total tweets this stream delivers (by 'total tweets' I mean
> > all tweets that contain the tracked keywords).
> > This information is crucial for statistical confidence: a
> > very small sample may not be significant.
> >
> > Additionally, I've been googling and reading a lot for 3 days and I
> > can't figure out how I can use the different access levels.
> > I've read http://dev.twitter.com/pages/streaming_api_methods#statuses-filter
> > but how can I use these different levels of access?
> >
> > Thanks in advance!
> > Regards
> > Alejandro.
>
> I actually think the answer to *your* question is, "If your filter
> criteria are sufficiently narrow, you get *all* of the public tweets
> with those keywords sent by users who aren't being blocked by
> Twitter's quality filter." At least that's what the documentation has
> said in the past.
>
> But *my* question is, "How does one determine the total number of
> tweets, for some definition of total?"
>
> a. All tweets created, including those that aren't public?
> b. All public tweets created, including those from "low quality users"
>    that don't get indexed by search or sent to the "filter" stream?
> c. All tweets sent to the inlet of the filter stream and the various
>    elevated access level streams?
>
> Remind me again - when does "Snowflake" go live? I haven't looked at
> Streaming data for a couple months.
>
> --
> M. Edward (Ed) Borasky
> http://borasky-research.net
> http://twitter.com/znmeb
>
> "A mathematician is a device for turning coffee into theorems." - Paul Erdos

--
Twitter developer documentation and resources: http://dev.twitter.com/doc
API updates via Twitter: http://twitter.com/twitterapi
Issues/Enhancements Tracker: http://code.google.com/p/twitter-api/issues/list
Change your membership to this group: http://groups.google.com/group/twitter-development-talk
