Thanks for the quick reply, Ted. Here is an example of the kind of data I have:
20% sales for watches, Ann
50% sales for watches, Peter
70% sales for watches, Tom
... etc.

Using ngram I am able to find a 'stem' of the phrase, which here is
'sales<>for<>watches<>,' (a 4-gram). This is what I call a cluster.
Additionally, I add 2 extra parameters: the date and the size of the email.
Later I use them to evaluate the cluster.
Next, I take a new line similar to the examples and, using only its text, I
want to either recognize the cluster to which it belongs or determine that it
belongs to none of the clusters in my list.
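
The lookup described above could be sketched roughly as follows. This is only an illustration in Python, not part of Text::NSP: the function names (tokenize, parse_stem, find_cluster) are made up for the example, the tokenization (words plus single punctuation marks) is an assumption chosen so that the comma becomes its own token as in the stem above, and the 'shoes' stem is hypothetical data.

```python
import re

# Assumed tokenization: runs of word characters, or one punctuation mark.
# "20% sales for watches, Ann" -> ["20", "sales", "for", "watches", ",", "ann"]
TOKEN_RE = re.compile(r"\w+|[.,;:?!]")


def tokenize(line):
    """Split a line into lower-cased tokens."""
    return TOKEN_RE.findall(line.lower())


def parse_stem(stem):
    """Turn 'sales<>for<>watches<>,' into ('sales', 'for', 'watches', ',')."""
    return tuple(part for part in stem.lower().split("<>") if part)


def find_cluster(line, stems):
    """Return the longest known stem that occurs as a contiguous token
    sequence in the line, or None if no stem matches."""
    tokens = tokenize(line)
    best, best_len = None, 0
    for stem in stems:
        gram = parse_stem(stem)
        n = len(gram)
        if n <= best_len:
            continue  # cannot beat the match we already have
        # Slide a window of n tokens over the line and compare.
        for i in range(len(tokens) - n + 1):
            if tuple(tokens[i:i + n]) == gram:
                best, best_len = stem, n
                break
    return best


# Hypothetical list of already known clusters (stems).
known_stems = ["sales<>for<>watches<>,", "sales<>for<>shoes<>,"]

print(find_cluster("35% sales for watches, Maria", known_stems))
print(find_cluster("Meeting moved to Friday", known_stems))
```

Exact matching like this only recognizes lines that contain a stem verbatim; lines that differ slightly from every stem would need some similarity measure instead.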

I tried SenseClusters on my data some time ago. To be honest, I was lost in
its options, and the results I got were not good. Do you think it might have
an option for such a task?

Cheers,
Jelena

--- In ngram@yahoogroups.com, Ted Pedersen <tpederse@...> wrote:
>
> Hi Jelena,
> 
> Could you describe in a little more detail how you are using Text::NSP as a
> part of the clustering work you describe?
> 
> http://ngram.sourceforge.net
> 
> Text::NSP doesn't have native support for clustering, although it's easy to
> imagine using it as a part of a larger clustering process (and in fact
> that's what we do in our SenseClusters package
> http://senseclusters.sourceforge.net ).
> 
> In any case, if you can describe where Text::NSP fits into this picture I'm
> sure we'll be able to make some suggestions.
> 
> Cordially,
> Ted
> 
> On Fri, Jan 14, 2011 at 9:51 AM, jelena_isacenkova <jelena.info@...> wrote:
> 
> >
> >
> > Hi guys,
> >
> > I am currently using the module to cluster the text lines (phrases) in
> > n-grams and it's working fine. I would like to also cluster the new incoming
> > text records based on the clustered data. I am sure it is a common thing to
> > do, however, it is not described in the package.
> >
> > I have an idea of using a token option to describe the list of already
> > existing clusters, in order to check whether a new record matches any of
> > them. I would be very interested to know your opinion and comments.
> >
> > Thanks in advance,
> > Jelena
> >
> >  
> >
> 
> 
> 
> -- 
> Ted Pedersen
> http://www.d.umn.edu/~tpederse
>

