Re: [twitter-dev] Tips to avoid hitting rate limits for my movie monitoring application.

2010-03-10 Thread Rahul Dighe
Thanks. I need to put more thought into this. At the moment I am inclined
to think that the Search API will probably deliver better results, because
filtering thousands and thousands of records for even something as basic as
a movie called "New York" or "Independence Day", split into independent
words, will probably be expensive and might end up being a search for a
needle in a haystack.

Having said that, I think Twitter has surely designed this API with good
thought. It just needs further analysis on my end as to whether the cost
of filtering outweighs the benefit of getting real-time streaming results.

thanks
rahul.


Re: [twitter-dev] Tips to avoid hitting rate limits for my movie monitoring application.

2010-03-10 Thread John Kalucki
We'd like to offer phrase search, or at least AND search on the Streaming
API, but we've had other priorities recently.

Note that Search is not intended for repeated automated keyword queries, and
that Search results are filtered for relevance. If you need all the Tweets,
or if you need them in real-time, the Streaming API is the best answer. The
Search API is mostly intended for complex, historical backfill, ad hoc, and
direct-display-to-user queries.

-John Kalucki
http://twitter.com/jkalucki
Infrastructure, Twitter Inc.




Re: [twitter-dev] Tips to avoid hitting rate limits for my movie monitoring application.

2010-03-10 Thread Rahul Dighe
Thanks John. I had not considered the implication of search results being
returned by relevance. I will give the Streaming API a shot.


Re: [twitter-dev] Tips to avoid hitting rate limits for my movie monitoring application.

2010-03-09 Thread Rahul Dighe
Hello,

Correct me if I am wrong, but doesn't the Streaming API have a limitation
that allows me to track only 200 keywords, with the added caveat that:

*Track keywords are case-insensitive logical ORs. Terms are exact-matched,
and also exact-matched ignoring punctuation. Phrases, keywords with spaces,
are not supported. Keywords containing punctuation will only exact match
tokens. Some UTF-8 keywords will not match correctly- this is a known
temporary defect.*

If this is the case, how will the API track keywords such as "The Hurt
Locker" or "The Blind Side"?

Thanks
Rahul Dighe
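
The consequence of the quoted rules can be illustrated with a toy model.
This is only a sketch of the documentation text above, not Twitter's
implementation: the function name track_matches is hypothetical, and
"ignoring punctuation" is assumed to mean stripping punctuation from the
start and end of each whitespace-separated token.

```python
import string


def track_matches(tweet_text, keywords):
    """Toy model of the quoted track rules; returns the keywords that match."""
    raw_tokens = set(tweet_text.lower().split())
    # Assumption: "ignoring punctuation" strips leading/trailing punctuation.
    bare_tokens = {t.strip(string.punctuation) for t in raw_tokens}
    matched = set()
    for keyword in keywords:
        k = keyword.lower()
        if " " in k:
            continue  # phrases (keywords with spaces) are not supported
        has_punct = any(ch in string.punctuation for ch in k)
        # Keywords containing punctuation only exact-match raw tokens.
        if k in raw_tokens or (not has_punct and k in bare_tokens):
            matched.add(keyword)
    return matched


print(track_matches("Loved The Hurt Locker!", ["locker", "The Hurt Locker"]))
print(track_matches("my gym locker is broken", ["locker"]))
```

Under this model the phrase never matches, while the single word "locker"
matches both the movie tweet and an unrelated one, which is the problem
the question raises.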



Re: [twitter-dev] Tips to avoid hitting rate limits for my movie monitoring application.

2010-03-09 Thread Mark McBride
This is correct. The general advice is to choose the most specific keyword
to track (probably "locker" and "blind" in this case), then run an
additional layer of filtering on your side. There are higher access levels
available that grant you more than 200 keywords to track.

  ---Mark

http://twitter.com/mccv
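
The additional layer of filtering Mark mentions could look something like
the sketch below. It assumes tweets arrive as plain text strings; the
TRACKED mapping and the function names are hypothetical, and the
normalization only handles ASCII letters and digits.

```python
import re

# Hypothetical mapping: single keyword sent to "track" -> full movie title.
TRACKED = {"locker": "The Hurt Locker", "blind": "The Blind Side"}


def normalize(text):
    """Lowercase and reduce to space-separated runs of ASCII letters/digits."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def titles_mentioned(tweet_text, tracked=TRACKED):
    """Return the full titles that appear as whole phrases in the tweet."""
    padded = f" {normalize(tweet_text)} "
    return [title for title in tracked.values()
            if f" {normalize(title)} " in padded]


print(titles_mentioned("Just saw The Hurt Locker - wow!"))
print(titles_mentioned("my gym locker is broken"))
```

The stream delivers every tweet containing "locker" or "blind", and the
phrase check discards the ones that do not mention the full title.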



[twitter-dev] Tips to avoid hitting rate limits for my movie monitoring application.

2010-03-08 Thread Rahul
Hello,

I am building an application that monitors tweets about movies (for
now, with other interesting things planned). I have my ID whitelisted
but I want to avoid overusing it.

The challenge I face is that ideally I want to make full use of the
opportunity to retrieve 100 tweets per call. For that I need
information on how frequently users are tweeting about a movie, so
that I can set my call frequency (to the Twitter Search API)
accordingly and maximize the number of tweets returned per call, or at
least get close.

Since I presume there is no way to know how frequently people are
tweeting about a movie, I need help working out the best way to
optimize for such a situation.

The challenge is complicated by the fact that users tweet about
different movies at different rates, and the rates generally decrease
over time.

I have tried combining searches, but the challenge is that, let's say,
I search for

(Movie A OR Movie B)
(Movie C OR Movie D)

It could be the case that people tweet about Movies A and B a lot and
little to none about C or D, or that they continue to tweet about A
but not about B. So I can still end up in a situation where I am not
optimizing my calls. Also, events such as the Oscars can dramatically
change what people talk about, even for movies released months ago.

I have thought of writing something like a variable-frequency caller
that checks the number of tweets returned by the last 3 calls in order
to estimate the tweet rate for a given search, and then continuously
varies the time between calls so that I get as close to 100 tweets as
possible per call.

Any ideas or suggestions for ways to alleviate the above will be
highly appreciated.

Thanks
Rahul.
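
The variable-frequency caller described above might be sketched as
follows. This is only an illustration under stated assumptions: the
function name, the history format, and the min_wait / max_wait bounds are
all hypothetical, and the actual search call is left out.

```python
def next_interval(history, target=100, min_wait=5.0, max_wait=900.0):
    """Suggest how many seconds to wait before the next search call.

    history: list of (tweets_returned, seconds_since_previous_call),
    oldest first. Only the last 3 entries are used.
    """
    recent = history[-3:]
    if not recent:
        return min_wait  # no data yet: start with the shortest wait
    tweets = sum(n for n, _ in recent)
    seconds = sum(s for _, s in recent)
    if tweets == 0:
        return max_wait  # nothing seen recently: back off fully
    # Time needed to accumulate `target` tweets at the observed rate.
    wait = target * seconds / tweets
    return max(min_wait, min(max_wait, wait))


# Three calls, each 60 seconds apart, each returning 50 tweets.
print(next_interval([(50, 60), (50, 60), (50, 60)]))  # 120.0
```

One caveat: when a call comes back with the full 100 tweets, the true rate
may be higher than observed, so aiming slightly below 100 (for example
target=80) makes that saturation easier to detect.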



Re: [twitter-dev] Tips to avoid hitting rate limits for my movie monitoring application.

2010-03-08 Thread Mark McBride
This sounds like a perfect use case for the streaming API.  The rate limits
there are different, but in general more permissive. And because you're
doing primarily OR queries, the current track functionality seems
sufficient.

  ---Mark

http://twitter.com/mccv



Re: [twitter-dev] Tips to avoid hitting rate limits for my movie monitoring application.

2010-03-08 Thread M. Edward (Ed) Borasky
What would make use of Streaming for this use case a *lot* easier  
would be if Twitter would export to the API more detailed information  
about the Trending Topics. For example, I'd like to see more topics  
than just the current number displayed, and tweets per unit time  
(hourly worst case) for each topic. I'd like to see at least the Top  
100 and maybe even the Top 1000! This seems to me to be an easy task -  
you've got to do the computations anyway, right? Heck, with pages /  
cursors, you could send the whole table out and let people do their  
own cutoffs.


For example, over the weekend, the Trending Topics were,  
understandably, dominated by the Oscars. That's ten or twenty right  
there, by the time you factor in the fact that Farrah Fawcett got
ignored in the memorial, ten pictures nominated for best picture, ten  
Best / Best Supporting actresses, ten actors, etc. Throw the perennial  
Justin Bieber and Lady Gaga into the mix and it's clear there's  
interesting and useful information further down the list. Why should  
we have to monitor Streaming and do our own topic analysis and  
filtering, or subscribe to some service with Firehose access?


--
M. Edward (Ed) Borasky
borasky-research.net/m-edward-ed-borasky/

A mathematician is a device for turning coffee into theorems. ~ Paul Erdos
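
For comparison, the do-it-yourself route of monitoring a stream and
counting tweets per unit time can be sketched roughly as below. It assumes
tweets are available as (unix_timestamp, text) pairs and matches keywords
as whole whitespace-separated words; the function name and input format
are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone


def hourly_counts(tweets, keywords):
    """Count tweets per (UTC hour, keyword).

    tweets: iterable of (unix_timestamp, text) pairs.
    """
    wanted = [k.lower() for k in keywords]
    counts = Counter()
    for timestamp, text in tweets:
        hour = datetime.fromtimestamp(timestamp, tz=timezone.utc)
        bucket = hour.strftime("%Y-%m-%d %H:00")
        tokens = set(text.lower().split())
        for keyword in wanted:
            if keyword in tokens:
                counts[(bucket, keyword)] += 1
    return counts


sample = [
    (1268006400, "oscars tonight"),      # 2010-03-08 00:00 UTC
    (1268008200, "Oscars red carpet"),   # 30 minutes later
    (1268010000, "oscars over"),         # next hour
]
for key, n in sorted(hourly_counts(sample, ["oscars"]).items()):
    print(key, n)
```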

