[twitter-dev] Re: Throttling of filter stream

Robert Chatley Mon, 28 Sep 2009 03:34:11 -0700

Hi,

I also have a question regarding throttling of the streaming API when
tracking keywords.


We are successfully tracking keywords and reading messages, but would
like to know when our query is too broad, and we are not receiving all
the messages, so that we can back off. We would prefer to be getting
all the messages for a finer-grained query than most of the messages
for a broader one.

Is it possible for the client to tell whether its query is being
throttled? I checked the rate-limit data on the returned statuses, but
these didn't seem to give useful information for the streaming API - I
guess they only give data about GET requests to other APIs.
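
For reference, the limit notifications mentioned further down this thread
look like the one in-band signal a client could watch. Below is a minimal
sketch, not a confirmed recipe. It assumes the stream delivers one JSON
object per line, that a limit notice has the shape {"limit": {"track": N}},
and that ordinary statuses carry a "text" field. The function name
summarize_stream is hypothetical.

```python
import json


def summarize_stream(lines):
    """Count delivered statuses and remember the most recent limit value.

    Assumes (not confirmed in this thread) that each non-blank line is one
    JSON object and that limit notices look like {"limit": {"track": N}}.
    """
    delivered = 0
    last_limit = None
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank keep-alive lines
        msg = json.loads(raw)
        if "limit" in msg:
            # A limit notice: some matching statuses were not delivered.
            last_limit = msg["limit"].get("track")
        elif "text" in msg:
            delivered += 1
    return delivered, last_limit
```

If last_limit is ever not None, the query is matching more than is being
delivered, which would be the cue to narrow the track terms.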

We are using the default access level.

regards,
Robert


On Sep 4, 4:20 am, John Kalucki <jkalu...@gmail.com> wrote:
> Zac,
>
> It's possible that the track filter is missing something, but there
> are probably other misunderstandings that are clouding things.
>
> I don't know how Tweespeed comes up with their numbers, but the
> Streaming API only makes available a proportion of all public
> statuses. Spam accounts, for example, are filtered out, as are
> protected accounts, direct messages, etc. etc. My guess is that
> Tweespeed is assuming that status_ids are assigned sequentially and
> they are just reporting the velocity of that column.
>
> Your estimate that 40% of tweets contain a link seems more than 2x too
> high. You can come up with a very accurate number by collecting a
> sampled feed for a few hours or days (there are diurnal and daily
> patterns to everything on Twitter) and dividing out. Even 10 minutes
> of the default sampled feed (the old "spritzer") will give you an
> idea.
>
> Without knowing your sample size, day of week, or time of day, I'd say
> that your reported matches per minute and limited statuses per minute
> are pretty good. I don't think you are missing much, if anything,
> other than the statuses reported by the limit message.
>
> As a double check, I just ran a quick test with the highest level of
> track and compared the result against the firehose. In a one minute
> sample, the track feed had matched the same tweets as the firehose
> piped to 'grep -i http'.
>
> -John Kalucki
> http://twitter.com/jkalucki
> Services, Twitter Inc
>
> On Sep 3, 7:23 pm, Zac Witte <zacwi...@gmail.com> wrote:
>
> > I'm not sure the filter is actually catching everything that I'm
> > supposedly tracking. There are ~20,000 tweets per minute right now
> > according to tweespeed. I'm getting about 1000 tweets/m and skipping
> > on average 1500 tweets/m according to the limit notifications. That
> > means my filter is matching about 12.5% of all tweets, but I'm
> > tracking "http" and supposedly 40% of all tweets contain a link so my
> > filter would seem to be missing the majority of all links. Is this
> > making sense?
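
The arithmetic in the quoted messages can be checked with a short sketch.
The figures are the ones reported above (about 1000 delivered and 1500
limited per minute against roughly 20,000 total). The function names
matched_share and link_fraction are hypothetical, and the substring test
only approximates the 'grep -i http' comparison described above.

```python
def matched_share(delivered_per_min, limited_per_min, total_per_min):
    """Share of all statuses that the filter matched.

    Matched = delivered + withheld (as reported by limit notices).
    """
    return (delivered_per_min + limited_per_min) / total_per_min


def link_fraction(texts):
    """Fraction of status texts containing 'http', case-insensitively."""
    texts = list(texts)
    if not texts:
        return 0.0
    hits = sum(1 for t in texts if "http" in t.lower())
    return hits / len(texts)


# Figures quoted in the thread: (1000 + 1500) / 20000
share = matched_share(1000, 1500, 20000)
print(f"{share:.1%}")  # 12.5%
```

Running link_fraction over the texts collected from a sampled feed gives the
"dividing out" estimate suggested above, which can then be compared with the
matched share.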
