On Sat, May 23, 2009 at 9:09 AM, J... <celebur...@gmail.com> wrote:
>
> I am curious what kind of server/hosting plans are used for sites like
> tweetmeme.com where twitter links are being indexed.

I operate TwURLed News, which does that sort of thing (http://twURLedNews.com). The database and analytics code run on Python and MySQL on an Intel-based BSD box. The site itself is hosted at Bluehost, using WordPress. I originally had the database there, too, but it looked like it would eat too many CPU cycles, so I moved it back to a machine at my office.

TwURLed News doesn't need a lot of processing power because it doesn't try to drink the entire firehose. It follows and crawls Twitter users based on their track record of citing URLs that became popular, their proximity in the social network to such people, and their use of two-word phrases used by such people. In other words, it is a recursive graph exploration in which citing a URL that becomes popular adds to your weight in the graph.

Every few minutes, it publishes the URLs that were posted by the currently highest-scoring people. It also follows the highest-scoring people, periodically un-following aggregators and those whose scores have fallen too low. At the moment it is following about 2,000 people (http://twitter.com/twurlednews).

Nick
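
To make the scoring idea above concrete, here is a minimal Python sketch. It is not the actual TwURLed News code. It assumes a user's base weight is the number of their cited URLs that later became popular, and that proximity is modelled by adding a damped share of each neighbour's score over a few rounds. The names score_users, urls_to_publish, neighbours, rounds and damping are all hypothetical, and the two-word-phrase signal is left out.

```python
def score_users(citations, popular_urls, neighbours, rounds=2, damping=0.5):
    """Score users by popular URLs cited plus a damped share of neighbours' scores.

    citations:    dict mapping user -> set of URLs that user cited
    popular_urls: set of URLs that went on to become popular
    neighbours:   dict mapping user -> set of users they are connected to
    (All three structures are assumptions made for this sketch.)
    """
    users = set(citations) | set(neighbours)
    # Base weight: how many of a user's cited URLs became popular.
    base = {u: float(len(citations.get(u, set()) & popular_urls)) for u in users}
    scores = dict(base)
    for _ in range(rounds):
        # Each round, proximity to high scorers adds a damped bonus.
        scores = {
            u: base[u] + damping * sum(scores.get(v, 0.0) for v in neighbours.get(u, ()))
            for u in users
        }
    return scores


def urls_to_publish(recent_posts, scores, k=2):
    """Return URLs recently posted by the k highest-scoring users, in post order."""
    # Sort by descending score, breaking ties by user name for determinism.
    top = set(sorted(scores, key=lambda u: (-scores[u], u))[:k])
    return [url for user, url in recent_posts if user in top]


if __name__ == "__main__":
    citations = {"alice": {"u1", "u2"}, "bob": {"u3"}, "carol": set()}
    neighbours = {"carol": {"alice"}, "bob": set()}
    scores = score_users(citations, {"u1", "u2"}, neighbours)
    print(scores)
    print(urls_to_publish([("bob", "u9"), ("carol", "u7"), ("alice", "u8")], scores))
```

Running something like this every few minutes over only the followed users is cheap, which fits the point above about not needing to process the whole firehose.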