Both: it is text mining and relational operations, so Mahout and NLTK may
come into play. I know much more about text extraction (my current project)
and less about the newer text-mining tools, but they are all the rage in
eDiscovery now, so to make FreeEed useful and competitive, it will have
to incorporate them.

The test data usually comes from the Enron emails:
http://edrm.net/resources/data-sets
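
To make the chaining idea from the quoted message below a bit more concrete,
here is a minimal Python sketch. It is not FreeEed code, and it assumes each
email still carries standard Message-ID and In-Reply-To headers (real
eDiscovery data often needs fallbacks such as subject or quoted-text
matching). The function name build_chains is hypothetical.

```python
from collections import defaultdict
from email import message_from_string


def build_chains(raw_messages):
    """Group raw RFC 822 messages into chains keyed by the root Message-ID.

    Assumption: replies are linked only through the In-Reply-To header.
    """
    parent = {}  # Message-ID -> Message-ID it replies to (None for a root)
    for raw in raw_messages:
        msg = message_from_string(raw)
        msg_id = (msg.get("Message-ID") or "").strip()
        if not msg_id:
            continue  # a message without an ID cannot be linked this way
        parent[msg_id] = (msg.get("In-Reply-To") or "").strip() or None

    def root_of(msg_id):
        # Walk up the reply links; the 'seen' set guards against cycles.
        seen = set()
        while parent.get(msg_id) and msg_id not in seen:
            seen.add(msg_id)
            msg_id = parent[msg_id]
        return msg_id

    chains = defaultdict(list)
    for msg_id in parent:
        chains[root_of(msg_id)].append(msg_id)
    return dict(chains)
```

Each chain could then be sorted by the Date header to start on the "who knew
what when" question; the sender/recipient relationships would come from the
From, To, and Cc headers of the messages in each chain.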

On Mon, Mar 28, 2011 at 10:58 PM, Edward J. Yoon <[email protected]> wrote:

> Interesting!
>
> BTW, isn't that text mining? Or do you mean the relational operations
> between senders and recipients?
>
> On Tue, Mar 29, 2011 at 12:07 PM, Mark Kerzner <[email protected]>
> wrote:
> > Well,
> >
> > I am planning to use Hama and Ravel to do email chaining for eDiscovery. You
> > are given a large set of emails, and you want to construct email chains, and
> > produce some intelligent deductions about who knew what when. I am working
> > on the basic processing now, FreeEed <https://github.com/markkerzner/FreeEed>,
> > and that would be one of my next steps.
> >
> > Mark
> >
> > On Mon, Mar 28, 2011 at 9:48 PM, Edward J. Yoon <[email protected]> wrote:
> >
> >> Just watched the video clip.
> >>
> >> Unfortunately, he seems to be missing one thing: so what are the killer apps?
> >>
> >> On Tue, Mar 29, 2011 at 9:19 AM, Edward J. Yoon <[email protected]>
> >> wrote:
> >> > Does anyone have an interest in Hama for the enterprise? :D
> >> >
> >> > On Tue, Mar 29, 2011 at 8:47 AM, Edward J. Yoon <[email protected]>
> >> wrote:
> >> >> http://gigaom.com/cloud/ravel-hopes-to-open-source-graph-databases/
> >> >>
> >> >> Sent from my iPhone
> >> >>
> >> >
> >> >
> >> >
> >> > --
> >> > Best Regards, Edward J. Yoon
> >> > http://blog.udanax.org
> >> > http://twitter.com/eddieyoon
> >> >
> >
