[twitter-dev] Re: Twitter 1500 search results

2010-06-07 Thread sahmed10
I have been trying a certain algorithm but haven't succeeded, and I am
getting an HTTP 403 response.
The algorithm is something like this: set a date range, say 2nd June to
4th June, with a string search query.

Call the search method and consume the first 1500 results. Since the
tweets have status IDs and you can limit searches with the max_id
and since_id parameters, get the since_id from the result and set it as
the max_id for the preceding 1500 tweets in time, and so on.
Is this approach going to work?



On Jun 7, 4:24 pm, Pascal Jürgens
lists.pascal.juerg...@googlemail.com wrote:
 As stated in the API WIKI, the number of search results you can get at any 
 given point in time for one search term is indeed ~1500.
 (http://apiwiki.twitter.com/Twitter-Search-API-Method:-search)

 There are several ways to go beyond that.

 a) Do perpetual searches (say, one every day), and merge the results
 b) Get streaming access and track keywords in real time
 c) Vary search terms and combine the results
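
 Options (a) and (c) both come down to merging several result sets and dropping tweets that appear more than once. A minimal sketch in Python, assuming each result is a dict with an "id" key; the function name merge_results and the dict layout are hypothetical, not part of the Search API:

```python
from typing import Any, Dict, Iterable, List

Tweet = Dict[str, Any]


def merge_results(*runs: Iterable[Tweet]) -> List[Tweet]:
    """Merge several search runs, keeping one copy of each status ID."""
    by_id: Dict[Any, Tweet] = {}
    for run in runs:
        for tweet in run:
            # The ID is used only as a primary key: the first copy seen wins.
            by_id.setdefault(tweet["id"], tweet)
    # Sorting by ID gives an approximate newest-first order only.
    return sorted(by_id.values(), key=lambda t: t["id"], reverse=True)


monday = [{"id": 5, "text": "iphone a"}, {"id": 3, "text": "iphone b"}]
tuesday = [{"id": 9, "text": "iphone c"}, {"id": 5, "text": "iphone a"}]
merged = merge_results(monday, tuesday)
print([t["id"] for t in merged])
```

 Keying on the ID alone means overlapping daily searches or overlapping search terms can be combined in any order.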

 Good luck.

 On Jun 7, 2010, at 22:53 , sahmed10 wrote:

  I am developing an application where I am trying to get more than 1500
  results for a search query. Is it possible?
  For example, when I ask for all results from 2nd June to 6th June
  with the search string "iphone", I only get the 1500 latest tweets, but
  I am interested in all the tweets that mention
  iphone from 2nd June to 6th June. Is there a workaround for this?


[twitter-dev] Re: Twitter 1500 search results

2010-06-07 Thread sahmed10
Yes, it works! This algorithm works.
It's something like this:
Set the query to a string with appropriate To and From dates. Then
consume the 1500 streaming results and also save the status ID of the
very last tweet you got. As they are in sequential order (with gaps),
it won't be a problem. The very last tweet's status ID should be assigned
as the max_id for the next set of results, and so on.
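
The loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions: fetch_page is a hypothetical callable standing in for whatever client performs the search request, it is assumed to return tweets newest first, and max_id is assumed to be inclusive. The in-memory make_fake_fetch exists only so the sketch runs without network access.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

Tweet = Dict[str, Any]
# Hypothetical signature: fetch_page(query, max_id) -> tweets, newest first.
FetchPage = Callable[[str, Optional[int]], List[Tweet]]


def search_all(query: str, fetch_page: FetchPage,
               max_batches: int = 10) -> Iterator[Tweet]:
    """Walk backwards in time by reusing the oldest ID seen as max_id."""
    max_id: Optional[int] = None
    seen = set()
    for _ in range(max_batches):
        batch = fetch_page(query, max_id)
        fresh = [t for t in batch if t["id"] not in seen]
        if not fresh:
            break  # nothing new came back, so stop
        for tweet in fresh:
            seen.add(tweet["id"])
            yield tweet
        # IDs are only compared, never subtracted. Because max_id is assumed
        # inclusive, the boundary tweet repeats and is filtered via `seen`.
        max_id = min(t["id"] for t in fresh)


def make_fake_fetch(ids: List[int], page_size: int) -> FetchPage:
    """In-memory stand-in for the real search call (illustration only)."""
    ordered = sorted(ids, reverse=True)

    def fetch_page(query: str, max_id: Optional[int]) -> List[Tweet]:
        eligible = [i for i in ordered if max_id is None or i <= max_id]
        return [{"id": i, "text": f"{query} #{i}"} for i in eligible[:page_size]]

    return fetch_page


fetch = make_fake_fetch([3, 7, 8, 15, 21, 22, 40], page_size=3)
collected = [t["id"] for t in search_all("iphone", fetch)]
print(collected)
```

The max_batches cap is there so that a misbehaving fetch cannot loop forever; a real client would also need to respect rate limits.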


Re: [twitter-dev] Re: Twitter 1500 search results

2010-06-07 Thread Pascal Jürgens
Good to know. Did you mean to say consume … streaming results? I don't really 
see where you use the stream here.

Also, please note that it's not a good idea to work with since_id and 
max_id any more, because those will soon be (already are?) NON-SEQUENTIAL. 
This means you will lose tweets if you rely on the IDs incrementing over time. 
To quote the relevant email from Taylor Singletary:

 Please don't depend on the exact format of the ID. As our infrastructure 
 needs evolve, we might need to tweak the generation algorithm again.
 
 If you've been trying to divine meaning from status IDs aside from their role 
 as a primary key, you won't be able to anymore. Likewise for usage of IDs in 
 mathematical operations -- for instance, subtracting two status IDs to 
 determine the number of tweets in between will no longer be possible

Cheers.


Re: [twitter-dev] Re: Twitter 1500 search results

2010-06-07 Thread John Kalucki
Pascal,

These assumptions about since_id and max_id are incorrect. You can
still, and must still, rely upon them for fetching. The additional
jitter introduced by the id generation scheme is statistically
insignificant and very small compared to other reordering effects in
the Twitter system. Tweets are K-ordered over a multiple second window
as they are today, whereas the additional K introduced by the ID
generation system will be sub-second, if not sub-millisecond, and
practically irrelevant.

If you are doing repeated automated queries against the Search API,
you should transition to streaming. If you are attempting to get every
tweet that matches, which is clearly the case given the questions
below, transitioning to streaming is your only option, as search is
already filtering for relevance and this filtering will only increase
over time.

-John Kalucki
http://twitter.com/jkalucki
Infrastructure, Twitter Inc.


2010/6/7 Pascal Jürgens lists.pascal.juerg...@googlemail.com:
 Good to know. Did you mean to say consume … streaming results? I don't 
 really see where you use the stream here.

 Also, please note that it's not a good idea to work with since_id and 
 max_id any more, because those will soon be (already are?) NON-SEQUENTIAL. 
 This means you will lose tweets if you rely on the IDs incrementing over 
 time. To quote the relevant email from Taylor Singletary:

 Please don't depend on the exact format of the ID. As our infrastructure 
 needs evolve, we might need to tweak the generation algorithm again.

 If you've been trying to divine meaning from status IDs aside from their 
 role as a primary key, you won't be able to anymore. Likewise for usage of 
 IDs in mathematical operations -- for instance, subtracting two status IDs 
 to determine the number of tweets in between will no longer be possible

 Cheers.

 On Jun 8, 2010, at 0:06 , sahmed10 wrote:

 Yes, it works! This algorithm works.
 It's something like this:
 Set the query to a string with appropriate To and From dates. Then
 consume the 1500 streaming results and also save the status ID of the
 very last tweet you got. As they are in sequential order (with gaps),
 it won't be a problem. The very last tweet's status ID should be assigned
 as the max_id for the next set of results, and so on.