Hey,

This sounds like a collaborative filtering problem, but a rule-based system alone might not be the best choice for an environment as dynamic as Twitter. If you can write a classifier using a bag-of-words approach and add it to your rule-based system, I think you stand a good chance. I would assume that a few hours' worth of tweets from the Stream, labeled as spam or not spam, would serve as a good training set for the algorithm. I do not have any empirical evidence as of now, but that is my hunch.
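
A minimal sketch of that idea, assuming a naive Bayes classifier over word counts (one common bag-of-words choice; nothing above fixes which classifier to use). All names here (`NaiveBayesSpamFilter`, `blacklisted`, `is_spam`, the `spam.example` domain, the four training tweets) are made up for illustration, and the simple blocklist check only stands in for whatever rule-based system you already have:

```python
import math
import re
from collections import Counter

# Hypothetical tokenizer: lowercase words, hashtags, and @mentions.
TOKEN_RE = re.compile(r"[a-z0-9#@']+")


def tokenize(text):
    return TOKEN_RE.findall(text.lower())


class NaiveBayesSpamFilter:
    """Bag-of-words naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter()

    def train(self, text, label):
        # label must be "spam" or "ham"
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def spam_log_odds(self, text):
        """Return log P(spam | text) - log P(ham | text); > 0 leans spam."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        v = len(vocab) or 1  # avoid division by zero on an untrained model
        total_spam = sum(self.word_counts["spam"].values())
        total_ham = sum(self.word_counts["ham"].values())
        # Smoothed class prior.
        score = math.log((self.doc_counts["spam"] + 1) / (self.doc_counts["ham"] + 1))
        for word in tokenize(text):
            p_spam = (self.word_counts["spam"][word] + 1) / (total_spam + v)
            p_ham = (self.word_counts["ham"][word] + 1) / (total_ham + v)
            score += math.log(p_spam / p_ham)
        return score


def blacklisted(text, blocked_domains):
    """Stand-in rule: flag tweets mentioning a blocked (lowercase) domain."""
    lowered = text.lower()
    return any(domain in lowered for domain in blocked_domains)


def is_spam(text, model, blocked_domains, threshold=0.0):
    """Combine the rule with the classifier: either one can flag the tweet."""
    return blacklisted(text, blocked_domains) or model.spam_log_odds(text) > threshold


# Toy training set; in practice this would be a few hours of labeled tweets.
model = NaiveBayesSpamFilter()
model.train("win a free iphone now click here", "spam")
model.train("free followers click this link now", "spam")
model.train("reading the streaming api docs this morning", "ham")
model.train("lunch with the team was great", "ham")

blocked = ["spam.example"]
print(is_spam("click here for free followers", model, blocked))
print(is_spam("great docs for the streaming api", model, blocked))
```

The threshold is a knob: raising it makes the classifier more conservative and leaves more of the decision to the rules.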

Regards,
Atul.

On Sat, Feb 27, 2010 at 6:29 PM, Fabien Penso <fabienpe...@gmail.com> wrote:
> Hi,
>
> I'm currently using the streaming API for a new service I work on, but
> I see lots of tweets I would consider as SPAM and I'd like to find a
> way to prevent it.
>
> I have not found anything to filter them, therefor I wrote a little
> blog post about how it could be done. Something to combine RBL and
> Tweets, but I wonder if that makes sense.
>
> Any feedback welcome.
>
> http://blog.penso.info/2010/02/28/filtering-spams-on-twitter-twitterbl/
>
> --
> Fabien Penso
> @fabienpenso
>

--
Regards,
Atul Kulkarni