That's a good explanation, thanks Mark. In your example, are those 50 tweets gone forever, or are they buffered into the following minute? I haven't seen the limit message yet.
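To check that I'm reading your hypothetical correctly, here is a rough Python sketch of the arithmetic. The cap of 100 per minute is your made-up number, `deliver_minute` is just a name I invented for illustration, and the sketch assumes the overflow is dropped rather than carried into the next minute, which is exactly the part I'm unsure about.

```python
def deliver_minute(matches, cap=100):
    """Return (delivered, missed) for one minute of a filtered stream.

    Assumption: at most `cap` matching tweets are delivered per minute,
    and the remainder is only reported as a count (the limit message),
    not buffered for later delivery.
    """
    delivered = min(matches, cap)
    missed = matches - delivered
    return delivered, missed


# The two minutes from the hypothetical example: 50 matches, then 150.
for minute, matches in enumerate([50, 150], start=1):
    delivered, missed = deliver_minute(matches)
    print(f"minute {minute}: {matches} matches, "
          f"{delivered} delivered, {missed} missed")
```

If the stream buffered the overflow instead, the 50 missed tweets would show up in a later, quieter minute rather than only as a count.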
I realize there is the count parameter, which allows you to go back 150k, but it seems it doesn't apply to track accounts (I'm restricted track). Even if I were to open a second Shadow account to swap in when I get a limit message, my understanding is that Shadow only increases followers, not track keywords.

On Jan 13, 7:02 pm, Mark McBride <[email protected]> wrote:
> Check out the filter URL on the streaming API. It will return up to N
> tweets a minute, where N is the amount you'd get from a sampled
> stream. However it only returns tweets that match track keywords.
> Provided the number of filtered tweets is never above the sampled
> amount, you won't get limited.
>
> Let's take a hypothetical example. Using gardenhose you're throttled
> at 100 tweets a minute (not the real number). You track the keyword
> "twitter". During the first minute there are 50 matches. You get all
> 50. During the second minute there are 150 tweets about twitter.
> You'll get 100 tweets, and a limit message saying there were 50 more
> you missed due to throttling. Does this make sense?
>
> ---Mark
>
> http://twitter.com/mccv
>
> On Wed, Jan 13, 2010 at 10:55 AM, Ross Bates <[email protected]> wrote:
> > I'm reading the streaming API documentation and have a question about
> > track keywords. A set of keywords can be used to filter the gardenhose
> > but it doesn't actually increase your chance of getting tweets that
> > would not have been included in the unfiltered stream. The gardenhose
> > is a sample of the firehose and returns the same results to all
> > clients - correct?
>
> > If this is the case then for applications that need all data for
> > specific keywords I would think the search API remains the better
> > option? For example, if I needed all tweets that contained the words
> > foo OR bar the gardenhose can't guarantee I will get 100%.
>
> > What's confusing me is the email which went out the other day about
> > the streaming API. First the statement about polling for keywords:
>
> > "If your application polls for keywords, mentions, is whitelisted on
> > the Search API, or makes more than perhaps 10 queries per minute, you
> > should begin your migration to Streaming. Desktop clients should
> > postpone a migration to Streaming."
>
> > Then later in the email:
>
> > "Complete corpus search: Search is focused on result set quality and
> > there are no guarantees to return all matching tweets. Complete
> > results are only available on the Streaming API. Search results are
> > increasingly filtered and reordered for relevance."
>
> > This second statement differs from the streaming API documentation
> > which says that the streaming API is sampled.
>
> > Does the rollout of the streaming API to the general public mean that
> > results are no longer sampled?
>
> > -Ross
