No... I don't have any idea of crawling Twitter.

Cheers,
Arunachalam
On Mon, Jun 29, 2009 at 11:34 PM, Doug Williams <[email protected]> wrote:

> Please do not scrape our site. We have processes in place that will
> automatically block your spiders.
>
> If you feel that you have a compelling need for vast amounts of data,
> please email the API team [1] with a detailed description of your needs
> and the value you hope to create, and let's have a conversation.
>
> 1. http://apiwiki.twitter.com/Support
>
> Thanks,
> Doug
>
> On Mon, Jun 29, 2009 at 10:30 AM, Scott Haneda <[email protected]> wrote:
>
>> I don't think this is a function of a workaround. This is a function of
>> Twitter having a good policy in place to prevent abuse.
>>
>> You can do what you want by incrementally querying the API. The API
>> limits will make it take too long. Even with multiple accounts it will
>> be months before you get a final list. Even then, I'm not sure you could
>> keep on top of new user registrations.
>>
>> Having access to this data could only be used for nefarious efforts.
>> What you want would be a spammer's dream.
>>
>> I think you would be better off, and faster, to build a crawl farm,
>> crawl all links on Twitter.com, and parse the users out, bypassing the
>> API.
>>
>> Even with the API, as you add new records, records you just added will
>> expire, be deleted, get banned, blocked, etc. There is no way you could
>> ever have a reconciled system.
>>
>> Consider that if each username averages 10 bytes, you have 520,000,000
>> bytes of just username data to download. Let's double that for HTTP
>> overhead and other miscellaneous data that will come over the wire:
>> about one billion bytes.
>>
>> That's a conservative gigabyte of data that you would have to download
>> once a day and reconcile against the previous day. A gigabyte of just
>> usernames.
>>
>> Then you have all the CPU you will need, network lag, and the time to
>> insert into your data source.
>>
>> This is not something that can be worked around.
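Scott's back-of-the-envelope figures can be reproduced in a few lines (a rough sketch using the thread's own assumptions: roughly 52,000,000 users, about 10 bytes per username, and a 2x multiplier for HTTP overhead):

```python
# Back-of-the-envelope check of the bandwidth estimate in the thread.
# All inputs are the thread's assumptions, not measured values.
users = 52_000_000
bytes_per_username = 10

raw = users * bytes_per_username   # 520,000,000 bytes (~520 MB)
with_overhead = raw * 2            # ~1,040,000,000 bytes after HTTP overhead

gb = with_overhead / 1_000_000_000
print(f"~{gb:.2f} GB per full pass")  # prints "~1.04 GB per full pass"
```

So a daily full pass is on the order of a gigabyte of username data, before any reconciliation work.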
>> This is simply a limitation of scale, one that cannot be overcome. You
>> would need a direct link to Twitter's data sources, ideally from within
>> their data center to reduce network lag. This probably will not be
>> approved :)
>> --
>> Scott
>> iPhone says hello.
>>
>> On Jun 29, 2009, at 9:06 AM, Arunachalam <[email protected]> wrote:
>>
>> Even if I have my account whitelisted, which allows 20,000 requests per
>> hour, I would need to run for many days, which is not feasible. Any
>> other workaround?
>>
>> Is there any other way to get around these request limits?
>>
>> Cheers,
>> Arunachalam
>>
>> On Mon, Jun 29, 2009 at 7:01 PM, Abraham Williams <[email protected]> wrote:
>>
>>> There have been over 52,000,000 profiles created. You could just start
>>> at 1 and count up. It might take you a while, though.
>>>
>>> Abraham
>>>
>>> On Mon, Jun 29, 2009 at 07:55, Arunachalam <[email protected]> wrote:
>>> > Any idea how to implement the same using PHP or any other language?
>>> > I'm confused about the implementation.
>>> >
>>> > Cheers,
>>> > Arunachalam
>>> >
>>> > On Mon, Jun 29, 2009 at 5:57 PM, Cameron Kaiser <[email protected]> wrote:
>>> >>
>>> >> > I am looking to find the entire list of Twitter user IDs.
>>> >> >
>>> >> > The social graph method provides a way to fetch friend and
>>> >> > follower IDs, through which we can access a person's profile
>>> >> > using the user method - show. But this requires code to
>>> >> > recursively crawl the list from any starting ID, appending each
>>> >> > person's follower and friend IDs without duplicating.
>>> >> >
>>> >> > Do we have any other API to get the entire list? If not, are
>>> >> > there any other ways, apart from crawling, to get the entire
>>> >> > list?
>>> >>
>>> >> No, and no, there are no other ways.
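Abraham's "start at 1 and count up" suggestion can be sketched as a throttled sequential scan. This is illustrative only: the `users/show` URL shown is the long-retired 2009-era REST API, and the delay assumes the 20,000 requests/hour whitelisted quota mentioned in the thread. Even at that rate, 52,000,000 IDs works out to 2,600 hours of continuous requests, roughly 108 days, before accounting for new registrations:

```python
import time
import urllib.error
import urllib.request

# Throttle to the whitelisted quota from the thread (an assumption).
RATE_LIMIT_PER_HOUR = 20_000
DELAY = 3600 / RATE_LIMIT_PER_HOUR  # 0.18 s between requests

def fetch_profile(user_id):
    """Fetch one profile by numeric ID; None if the ID is unassigned,
    deleted, or suspended. Endpoint shown for illustration only."""
    url = f"http://twitter.com/users/show.json?user_id={user_id}"
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.read()
    except urllib.error.HTTPError:
        return None

def crawl(start, stop):
    """Yield (id, raw profile) for every assigned ID in [start, stop)."""
    for user_id in range(start, stop):
        profile = fetch_profile(user_id)
        if profile is not None:
            yield user_id, profile
        time.sleep(DELAY)  # stay under the hourly quota
```

At this pace a single account never finishes a pass in useful time, which is exactly the infeasibility Arunachalam raises above.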
>>> >>
>>> >> --
>>> >> ------------------------------------ personal:
>>> >> http://www.cameronkaiser.com/ --
>>> >> Cameron Kaiser * Floodgap Systems * www.floodgap.com *
>>> >> [email protected]
>>> >> -- Careful with that Axe, Eugene. -- Pink Floyd
>>> >> -------------------------------
>>>
>>> --
>>> Abraham Williams | Community Evangelist | http://web608.org
>>> Hacker | http://abrah.am | http://twitter.com/abraham
>>> Project | http://fireeagle.labs.poseurtech.com
>>> This email is: [ ] blogable [x] ask first [ ] private.
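For completeness, the social-graph crawl Arunachalam describes (start from a seed ID, enqueue friend and follower IDs, deduplicate with a visited set) is a breadth-first traversal. A minimal sketch, where `get_friend_ids` and `get_follower_ids` are hypothetical stand-ins for the social graph API calls:

```python
from collections import deque

def crawl_graph(seed_id, get_friend_ids, get_follower_ids):
    """Breadth-first crawl of the follow graph from seed_id.
    Returns the set of all user IDs reachable from the seed."""
    visited = {seed_id}       # dedup set: each ID is fetched once
    queue = deque([seed_id])
    while queue:
        user_id = queue.popleft()
        for neighbor in get_friend_ids(user_id) + get_follower_ids(user_id):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return visited
```

Note that this only discovers users connected to the seed; accounts with no friends and no followers are never reached, which is one more reason a complete list is unobtainable by crawling alone.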
