Hi, I am glad to see that you were able to get it working. I will try it as soon as possible. Probably something went wrong while I was downloading/applying/updating Mahout-60.
I am using the UC Berkeley annotated subset, which is listed on the page you linked. The archive is here:
http://bailando.sims.berkeley.edu/enron/enron_with_categories.tar.gz
and it is described here:
http://bailando.sims.berkeley.edu/enron_email.html

The labeling is multi-level. Each message can have a coarse genre, included/forwarded information, primary topics, and an emotional tone (if not neutral). There is a .cats file associated with each message. I wrote a little utility that lets you pick a label type, parses the .cats files, and writes each message into the appropriately labeled folder. It is also easy to just use subfolders 1 to 8 in the tar, since these folders are labeled by coarse genre. I can share this little app if you want.

I am very curious to see whether I will be able to make it work.

Thanks for the help,
Philippe

On Sun, Jul 27, 2008 at 11:29 AM, Robin Anil <[EMAIL PROTECTED]> wrote:
> Also could you tell me which version of the enron Email corpus are you using
> for classification. Please provide the link. I found tons of variations
> online. What classification labels are you using (Email User Name?).
> http://sgi.nu/enron/corpora.php
>
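P.S. In case it helps before I share the app, here is a minimal Python sketch of the label-sorting step described above. It is not the actual utility, and it rests on several assumptions: that each message NNN.txt sits next to an NNN.cats file, that each .cats line has the form n1,n2,n3 (top-level label type, second-level label, number of times it was assigned), and that label types 1 to 4 correspond to the four kinds listed above. The names LABEL_TYPES, parse_cats and sort_messages are made up for illustration.

```python
import shutil
from collections import defaultdict
from pathlib import Path

# Assumed meaning of the first number (n1) on each .cats line.
LABEL_TYPES = {
    1: "coarse genre",
    2: "included/forwarded information",
    3: "primary topics",
    4: "emotional tone",
}


def parse_cats(cats_path, label_type):
    """Return {second-level label: total count} for one label type."""
    totals = defaultdict(int)
    for line in Path(cats_path).read_text().splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3 or not all(p.isdigit() for p in parts):
            continue  # skip blank or malformed lines
        n1, n2, n3 = map(int, parts)
        if n1 == label_type:
            totals[n2] += n3
    return dict(totals)


def sort_messages(corpus_dir, output_dir, label_type):
    """Copy each message into output_dir/<label>/ for every label it carries.

    Returns the number of copies written. A message with several labels
    of the chosen type is copied once per label.
    """
    if label_type not in LABEL_TYPES:
        raise ValueError(f"unknown label type: {label_type}")
    corpus_dir, output_dir = Path(corpus_dir), Path(output_dir)
    copied = 0
    for cats_path in sorted(corpus_dir.rglob("*.cats")):
        msg_path = cats_path.with_suffix(".txt")  # assumed naming scheme
        if not msg_path.exists():
            continue
        for label in parse_cats(cats_path, label_type):
            target = output_dir / str(label)
            target.mkdir(parents=True, exist_ok=True)
            shutil.copy(msg_path, target / msg_path.name)
            copied += 1
    return copied
```

For example, sort_messages("enron_with_categories", "by_topic", 3) would group messages by primary topic (both directory names are placeholders). Copying a message into every folder whose label it carries is a choice of this sketch. For a single-label classifier it may be better to keep only the label with the highest count from parse_cats.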
