No... I don't have any idea of crawling Twitter.

Cheers,
Arunachalam
On Mon, Jun 29, 2009 at 11:34 PM, Doug Williams <[email protected]> wrote:

> Please do not scrape our site. We have processes in place that will
> automatically block your spiders.
>
> If you feel that you have a compelling need for vast amounts of data,
> please email the API team [1] with a detailed description of your needs
> and the value you hope to create, and let's have a conversation.
>
> 1. http://apiwiki.twitter.com/Support
>
> Thanks,
> Doug
>
> On Mon, Jun 29, 2009 at 10:30 AM, Scott Haneda <[email protected]> wrote:
>
>> I don't think this is a function of a workaround. This is a function of
>> Twitter having a good policy in place to prevent abuse.
>>
>> You can do what you want by incrementally querying the API. The API
>> limits will make it take too long. Even with multiple accounts it will
>> be months before you get a final list. Even then, I'm not sure you could
>> keep on top of new user registrations.
>>
>> Having access to this data could only be used for nefarious efforts.
>> What you want would be a spammer's dream.
>>
>> I think you would be better off, and faster, to build a crawl farm,
>> crawl all links on Twitter.com, and parse the users out, bypassing the
>> API.
>>
>> Even with the API, as you add new records, records you just added will
>> expire, be deleted, get banned, blocked, etc. There is no way you could
>> ever have a reconciled system.
>>
>> Consider that if each username averages 10 bytes, you have 520,000,000
>> bytes of just username data to download. Let's double that for HTTP
>> overhead and other miscellaneous data that will come over the wire:
>> about one billion bytes.
>>
>> That's a conservative gigabyte of data that you would have to download
>> once a day and reconcile against the previous day. A gigabyte of just
>> usernames.
>>
>> Then you have all the CPU you will need, network lag, and the time to
>> insert into your data source.
>>
>> This is not something that can be worked around.
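Scott's back-of-the-envelope figures can be reproduced in a few lines (a rough sketch using the thread's own assumptions: roughly 52,000,000 users, about 10 bytes per username, and a 2x multiplier for HTTP overhead):

```python
# Back-of-the-envelope check of the bandwidth estimate in the thread.
# All inputs are the thread's assumptions, not measured values.
users = 52_000_000
bytes_per_username = 10

raw = users * bytes_per_username   # 520,000,000 bytes (~520 MB)
with_overhead = raw * 2            # ~1,040,000,000 bytes after HTTP overhead

gb = with_overhead / 1_000_000_000
print(f"~{gb:.2f} GB per full pass")  # prints "~1.04 GB per full pass"
```

So a daily full pass is on the order of a gigabyte of username data, before any reconciliation work.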
>> This is simply a limitation of scale, one that cannot be overcome. You
>> would need a direct link to Twitter's data sources, ideally from within
>> their data center to reduce network lag. This probably will not be
>> approved :)
>> --
>> Scott
>> iPhone says hello.
>>
>> On Jun 29, 2009, at 9:06 AM, Arunachalam <[email protected]> wrote:
>>
>> Even if I have my account whitelisted, which allows 20,000 requests per
>> hour, I would need to run for many days, which is not feasible. Any
>> other workaround?
>>
>> Is there any other way to get around these request limits?
>>
>> Cheers,
>> Arunachalam
>>
>> On Mon, Jun 29, 2009 at 7:01 PM, Abraham Williams <[email protected]> wrote:
>>
>>> There have been over 52,000,000 profiles created. You could just start
>>> at 1 and count up. It might take you a while, though.
>>>
>>> Abraham
>>>
>>> On Mon, Jun 29, 2009 at 07:55, Arunachalam <[email protected]> wrote:
>>> > Any idea how to implement the same using PHP or any other language?
>>> > I'm confused about the implementation.
>>> >
>>> > Cheers,
>>> > Arunachalam
>>> >
>>> > On Mon, Jun 29, 2009 at 5:57 PM, Cameron Kaiser <[email protected]> wrote:
>>> >>
>>> >> > I am looking to find the entire list of Twitter user IDs.
>>> >> >
>>> >> > The social graph method provides a way to fetch friend and
>>> >> > follower IDs, through which we can access a person's profile
>>> >> > using the user method - show. But this requires code to
>>> >> > recursively crawl the list from any starting ID, appending each
>>> >> > person's follower and friend IDs without duplicating.
>>> >> >
>>> >> > Do we have any other API to get the entire list? If not, are
>>> >> > there any other ways, apart from crawling, to get the entire
>>> >> > list?
>>> >>
>>> >> No, and no, there are no other ways.
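Abraham's "start at 1 and count up" suggestion can be sketched as a throttled sequential scan. This is illustrative only: the `users/show` URL shown is the long-retired 2009-era REST API, and the delay assumes the 20,000 requests/hour whitelisted quota mentioned in the thread. Even at that rate, 52,000,000 IDs works out to 2,600 hours of continuous requests, roughly 108 days, before accounting for new registrations:

```python
import time
import urllib.error
import urllib.request

# Throttle to the whitelisted quota from the thread (an assumption).
RATE_LIMIT_PER_HOUR = 20_000
DELAY = 3600 / RATE_LIMIT_PER_HOUR  # 0.18 s between requests

def fetch_profile(user_id):
    """Fetch one profile by numeric ID; None if the ID is unassigned,
    deleted, or suspended. Endpoint shown for illustration only."""
    url = f"http://twitter.com/users/show.json?user_id={user_id}"
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.read()
    except urllib.error.HTTPError:
        return None

def crawl(start, stop):
    """Yield (id, raw profile) for every assigned ID in [start, stop)."""
    for user_id in range(start, stop):
        profile = fetch_profile(user_id)
        if profile is not None:
            yield user_id, profile
        time.sleep(DELAY)  # stay under the hourly quota
```

At this pace a single account never finishes a pass in useful time, which is exactly the infeasibility Arunachalam raises above.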
>>> >>
>>> >> --
>>> >> ------------------------------------ personal:
>>> >> http://www.cameronkaiser.com/ --
>>> >> Cameron Kaiser * Floodgap Systems * www.floodgap.com *
>>> >> [email protected]
>>> >> -- Careful with that Axe, Eugene. -- Pink Floyd
>>> >> -------------------------------
>>>
>>> --
>>> Abraham Williams | Community Evangelist | http://web608.org
>>> Hacker | http://abrah.am | http://twitter.com/abraham
>>> Project | http://fireeagle.labs.poseurtech.com
>>> This email is: [ ] blogable [x] ask first [ ] private.
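For completeness, the social-graph crawl Arunachalam describes (start from a seed ID, enqueue friend and follower IDs, deduplicate with a visited set) is a breadth-first traversal. A minimal sketch, where `get_friend_ids` and `get_follower_ids` are hypothetical stand-ins for the social graph API calls:

```python
from collections import deque

def crawl_graph(seed_id, get_friend_ids, get_follower_ids):
    """Breadth-first crawl of the follow graph from seed_id.
    Returns the set of all user IDs reachable from the seed."""
    visited = {seed_id}       # dedup set: each ID is fetched once
    queue = deque([seed_id])
    while queue:
        user_id = queue.popleft()
        for neighbor in get_friend_ids(user_id) + get_follower_ids(user_id):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return visited
```

Note that this only discovers users connected to the seed; accounts with no friends and no followers are never reached, which is one more reason a complete list is unobtainable by crawling alone.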
