Re: [Mailman-Developers] GSOC 2013 project discussion

Avik Pal Wed, 17 Apr 2013 09:52:04 -0700

Yes, I get your point, but these issues are part of any machine learning
project, and feature extraction has to be done with the synthetic data set
in mind.



On 17 April 2013 22:05, Terri Oda <[email protected]> wrote:

>
>
> Finding sources of spam (like that one) isn't that hard; it's finding
> sources of legit email combined with spam and classified and processed in
> the same way that's challenging.  As I said, you can combine a spam source
> like this with a publicly available mailing list to make a synthetic set,
> but scientifically speaking, those aren't really preferred ways to handle
> data because they come from multiple sources.
>
>
>
Well, in this regard the only thing I can do is keep looking. I am also
aware that data coming from different sources can be skewed, but these
things are never perfect and there is always scope for improvement. I
think our aim should be to implement a rudimentary classifier with
fairly good performance to start with.
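
To make the idea concrete, here is a minimal sketch of what such a
rudimentary classifier could look like: a small naive Bayes filter with
add-one smoothing, trained on a synthetic set built by labelling messages
from a spam source and from a public mailing list. All function names and
the sample messages are hypothetical placeholders made up for illustration,
not Mailman code and not real data, and a real project would need proper
feature extraction and evaluation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Very crude feature extraction: lowercase word tokens only.
    return re.findall(r"[a-z0-9']+", text.lower())


def build_synthetic_set(spam_msgs, ham_msgs):
    # Combine two independent sources into one labelled data set.
    # Caveat from the thread: the sources may differ in ways unrelated
    # to spam vs. legit, which can skew the classifier.
    return [(m, "spam") for m in spam_msgs] + [(m, "ham") for m in ham_msgs]


def train(dataset):
    # Count words per class and messages per class.
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in dataset:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab


def classify(model, text):
    # Pick the class with the higher log-probability under naive Bayes.
    # Assumes both classes appeared at least once in training.
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for tok in tokenize(text):
            if tok in vocab:  # ignore words never seen in training
                # Add-one (Laplace) smoothing avoids log(0).
                score += math.log(
                    (word_counts[label][tok] + 1) / (total_words + len(vocab))
                )
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up sample messages, purely for illustration.
spam = ["win free money now", "cheap pills free offer",
        "claim your free prize now"]
ham = ["patch for the archiver attached",
       "meeting notes for the mailman sprint",
       "please review the patch for the list config"]

model = train(build_synthetic_set(spam, ham))
```

Calling classify(model, some_text) returns either "spam" or "ham". The
multiple-source concern still applies here: any vocabulary difference
between the two sources is learned as if it were a spam signal.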
