On Sat, May 23, 2009 at 9:09 AM, J... <celebur...@gmail.com> wrote:

>
> I am curious what kind of server/hosting plans are used for sites like
> tweetmeme.com where twitter links are being indexed.


I operate TwURLed News (http://twURLedNews.com), which does that sort of
thing.  The database and analytics code run on Python and MySQL on an
Intel-based BSD box.  The site itself is hosted at Bluehost, using WordPress.
I originally had the database there, too, but it looked like it would eat too
many CPU cycles, so I moved it back to a machine at my office.

TwURLed News doesn't need a lot of processing power because it doesn't try
to drink the entire firehose.  It follows and crawls Twitter users based on
their track record of citing URLs that became popular, their proximity in
the social network to such people, and their use of the same two-word
phrases as such people.  In other words, it is a recursive graph exploration
in which citing a URL that becomes popular adds to your weight in the graph.  Every few
minutes, it publishes the URLs that were posted by the currently highest
scoring people.  It also follows the highest-scoring people, periodically
un-following aggregators and those whose scores have fallen too low.  At the
moment it is following about 2,000 people (http://twitter.com/twurlednews).
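
To make the idea concrete, here is a toy sketch in Python.  It is not the
production code: the function names, the damping factor and the number of
rounds are all made up for illustration, and the two-word-phrase signal is
left out entirely.  It only shows how a weight earned by citing popular URLs
can spread to nearby people in the graph, and how the top scorers and the
unfollow candidates could then be picked.

```python
def score_users(citations, popular_urls, follows, damping=0.5, rounds=2):
    """Toy scoring pass (illustrative only).

    citations:    {user: set of URLs the user has posted}
    popular_urls: set of URLs that later became popular
    follows:      {user: set of users that this user follows}
    damping, rounds: hypothetical tuning knobs, not taken from the real system
    """
    # Base weight: how many of the user's cited URLs became popular.
    base = {u: len(urls & popular_urls) for u, urls in citations.items()}
    scores = dict(base)
    for _ in range(rounds):
        updated = {}
        for user in scores:
            # Proximity: inherit a fraction of the scores of people you follow.
            inherited = sum(scores.get(n, 0.0) for n in follows.get(user, set()))
            updated[user] = base[user] + damping * inherited
        scores = updated
    return scores


def top_urls(citations, scores, k):
    """URLs posted by the k highest-scoring users (ties broken by name)."""
    best = sorted(scores, key=lambda u: (-scores[u], u))[:k]
    urls = set()
    for user in best:
        urls |= citations[user]
    return sorted(urls)


def to_unfollow(scores, threshold):
    """Users whose score has fallen below the (hypothetical) threshold."""
    return sorted(u for u, s in scores.items() if s < threshold)
```

Calling score_users() on a small dict of users, then top_urls() and
to_unfollow() on the result, mimics one publish/prune cycle.  More rounds
let a weight travel further through the graph; a smaller damping factor
makes it fade faster with distance.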

Nick
