Both: it is text mining and relational operations, so Mahout and NLTK may come into play. I know much more about text extraction (my current project) and less about these newer text-mining applications, but they are all the rage in eDiscovery now, so for FreeEed to be useful and competitive, it will have to incorporate them.
The test data usually comes from the Enron emails: http://edrm.net/resources/data-sets

On Mon, Mar 28, 2011 at 10:58 PM, Edward J. Yoon <[email protected]> wrote:

> Interesting!
>
> BTW, Isn't it a text-mining? or Do you mean the relational operations
> between senders and recipients?
>
> On Tue, Mar 29, 2011 at 12:07 PM, Mark Kerzner <[email protected]> wrote:
> > Well,
> >
> > I am planning to use Hama and Ravel to do email chaining for eDiscovery. You
> > are given a large set of emails, and you want to construct email chains, and
> > produce some intelligent deductions about who knew what when. I am working
> > on the basic processing now, FreeEed <https://github.com/markkerzner/FreeEed>,
> > and that would be one of my next steps.
> >
> > Mark
> >
> > On Mon, Mar 28, 2011 at 9:48 PM, Edward J. Yoon <[email protected]> wrote:
> >
> >> Just watched video clip.
> >>
> >> Unfortunately, he seems lack of - So what's the killer apps?
> >>
> >> On Tue, Mar 29, 2011 at 9:19 AM, Edward J. Yoon <[email protected]> wrote:
> >> > Have anyone interest in Hama for enterprise? :D
> >> >
> >> > On Tue, Mar 29, 2011 at 8:47 AM, Edward J. Yoon <[email protected]> wrote:
> >> >> http://gigaom.com/cloud/ravel-hopes-to-open-source-graph-databases/
> >> >>
> >> >> Sent from my iPhone
> >> >
> >> > --
> >> > Best Regards, Edward J. Yoon
> >> > http://blog.udanax.org
> >> > http://twitter.com/eddieyoon
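
To make "email chaining" in the thread above a bit more concrete, here is a minimal sketch in Python. It assumes each message carries the standard Message-ID and In-Reply-To headers and simply follows In-Reply-To links back to a root message. The function name build_chains is made up for illustration; this is not how FreeEed, Hama, or Ravel do it, and real eDiscovery data would likely also need subject and quoted-text matching for messages whose headers are missing.

```python
from collections import defaultdict
from email import message_from_string


def build_chains(raw_emails):
    """Group raw RFC 822 messages into chains keyed by the root Message-ID.

    Hypothetical helper: links messages only through In-Reply-To headers.
    """
    parent = {}  # Message-ID -> Message-ID it replies to (None for a root)
    for raw in raw_emails:
        msg = message_from_string(raw)
        msg_id = (msg["Message-ID"] or "").strip()
        if not msg_id:
            continue  # cannot place a message that has no ID
        parent[msg_id] = (msg["In-Reply-To"] or "").strip() or None

    def root_of(msg_id):
        # Walk up the reply links; `seen` guards against header cycles.
        seen = set()
        while parent.get(msg_id) and msg_id not in seen:
            seen.add(msg_id)
            msg_id = parent[msg_id]
        return msg_id

    chains = defaultdict(list)
    for msg_id in parent:
        chains[root_of(msg_id)].append(msg_id)
    return dict(chains)
```

Each chain comes back as a list of Message-IDs in input order. Sorting a chain by the Date header would be the natural next step toward the "who knew what when" question.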
