Re: GSOC

Ted Dunning Wed, 23 Mar 2011 09:39:49 -0700

Another important question is whether this is something that is Mahout-ish.

Mahout is a project that supports scalable data mining.  That currently
includes a mature recommendation framework, less mature clustering and
classification tools and a smattering of other tools.

What you are proposing sounds a bit more like an application made up of
different tools, possibly some from Mahout, and some from other sources.

How do you see this?

On Wed, Mar 23, 2011 at 9:37 AM, Ted Dunning <[email protected]> wrote:

> Let's take this back to the mailing list so all can see.
>
> If you are familiar with the Stanford parser, then this seems like a
> feasible project for you to accomplish.  I would expect that very similar
> results could be achieved using simple word or phrase counts, possibly with
> the addition of a chunker.  My guess is that the parser would add very
> little.
>
> Stefan Henß did some interesting and very simple work, for instance, for
> automated FAQ generation that avoided parsing:
>
>
> http://mail-archives.apache.org/mod_mbox/mahout-user/201102.mbox/%[email protected]%3E
>
> On Wed, Mar 23, 2011 at 3:24 AM, Harsh <[email protected]> wrote:
>
>> I want to build on the Stanford parser (the one I am familiar with) and
>> create a dependency graph for each sentence. The most frequently occurring
>> words in a paragraph generally indicate its theme. Using the dependency
>> graph and word counts, I want to guess the theme of the paragraph.
>>
>>
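
The "simple word or phrase counts" baseline suggested in the quoted reply can be sketched roughly as follows. This is only an illustration, not code from Mahout or from the thread: the function name theme_terms, the tiny STOP_WORDS set, and the sample paragraph are all assumptions made for the example, and a real system would want a proper tokenizer and stop-word list.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this
# sketch); a real list would be much longer.
STOP_WORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "are",
    "it", "its", "on", "for", "with", "that", "this", "as", "by", "be",
}


def theme_terms(paragraph, top_n=3):
    """Return the top_n most frequent non-stop-word tokens in a paragraph."""
    # Lowercase and keep runs of letters/apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", paragraph.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    # most_common sorts by count; ties keep first-seen order.
    return [word for word, _ in counts.most_common(top_n)]


if __name__ == "__main__":
    sample = (
        "The parser reads the sentence. The parser builds a graph of "
        "the sentence. The parser is slow."
    )
    print(theme_terms(sample, top_n=2))
```

Counting adjacent word pairs instead of single tokens would give the "phrase counts" variant; comparing either against a parser-based approach on the same paragraphs would show how much the dependency graph actually adds.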
