Hey,
This sounds like a collaborative filtering problem. But a rule-based system
alone might not be your best choice for an environment as dynamic as
Twitter. If you can develop a bag-of-words classifier and add it to your
rule-based system, I think you stand a good chance. I would assume that a
few hours' worth of tweets from the streaming API, each labeled as spam or
not spam, would serve as a good training set for the algorithm. I do not
have any empirical evidence as of now, but that is my hunch about this.
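
To make the idea a bit more concrete, here is a rough sketch in Python of
what I have in mind: a small naive Bayes classifier over bag-of-words counts,
consulted only when none of your rules fire. Everything in it is my own
assumption for illustration (the names NaiveBayesSpamFilter and is_spam, the
tokenizer, the add-one smoothing, the zero threshold); it is untested on real
tweets and is not meant as a finished filter.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Bag of words: lowercase word-like tokens, order is ignored.
    return re.findall(r"[a-z0-9#@']+", text.lower())


class NaiveBayesSpamFilter:
    """Hypothetical minimal classifier; labels are 'spam' and 'ham'."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter()

    def train(self, text, label):
        # Count one labeled tweet and its tokens.
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def spam_log_odds(self, text):
        # Positive result means the tweet looks more like spam than ham.
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        tokens = tokenize(text)
        scores = {}
        for label in ("spam", "ham"):
            n_words = sum(self.word_counts[label].values())
            # Add-one smoothing on both the prior and the word likelihoods,
            # so unseen words never produce log(0).
            logp = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            for tok in tokens:
                count = self.word_counts[label][tok]
                logp += math.log((count + 1) / (n_words + len(vocab) + 1))
            scores[label] = logp
        return scores["spam"] - scores["ham"]


def is_spam(text, model, rules=()):
    # Rules (for example a blacklist lookup) take precedence; the
    # classifier decides only when no rule matches.
    if any(rule(text) for rule in rules):
        return True
    return model.spam_log_odds(text) > 0
```

You would call train() once per labeled tweet from your sample, then pass
each incoming tweet to is_spam() together with your existing rules written
as functions that take the tweet text and return True or False.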
Regards,
Atul.
On Sat, Feb 27, 2010 at 6:29 PM, Fabien Penso wrote:
> Hi,
>
> I'm currently using the streaming API for a new service I work on, but
> I see lots of tweets I would consider as SPAM and I'd like to find a
> way to prevent it.
>
> I have not found anything to filter them, therefore I wrote a little
> blog post about how it could be done. Something to combine RBL and
> Tweets, but I wonder if that makes sense.
>
> Any feedback welcome.
>
> http://blog.penso.info/2010/02/28/filtering-spams-on-twitter-twitterbl/
>
> --
> Fabien Penso
> @fabienpenso
>
--
Regards,
Atul Kulkarni