Re: [Senseclusters-users] [Corpora-List] SenseClusters v0.95 released (now supports LSA)

Ted Pedersen Tue, 25 Nov 2008 06:35:16 -0800

Hi Rob,

There are no restrictions on what you can run on the web interface -
as a practical matter you might find that some operations take an
especially long time, but in general we don't notice any problems with
the machine getting overloaded.


If you continue to have install issues do let us know - there are a
few places where the install is more difficult than it should be (PDL
for example, can cause some trouble, and SVD is sometimes tricky
depending on which C compiler you are using...). But, we are usually
able to figure out most install difficulties.

As to using a phrase as a target word, there is really no reason you
can't do that. We have used words as our targets just because it suits
our purposes, but you can define the "structure" of your target words
using the --target option - that will give you the ability to define a
regular expression that indicates what should be found inside the head
tags...

As an off the cuff example, there is no reason your target expression
couldn't be something like the following (which you could provide via
the --target option)...

/<head>(I|She|We|You) (love|hate) fish(es)?</head>/
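
As a rough sanity check of what an expression like this would pick up, here is a minimal sketch in Python. It uses Python's re module rather than SenseClusters itself, on the assumption that this particular pattern behaves the same way in both, since it only uses alternation, grouping, and an optional group. The surrounding slashes are dropped because Python does not use them as delimiters, and the sample contexts and the find_target helper are invented for illustration.

```python
import re

# The off-the-cuff target expression from above, without the / delimiters.
TARGET = re.compile(r"<head>(I|She|We|You) (love|hate) fish(es)?</head>")

# Hypothetical contexts, invented for illustration only.
contexts = [
    "He said <head>I love fish</head> and left.",
    "Then <head>We hate fishes</head> was the reply.",
    "Only one word: <head>fish</head> here.",
]


def find_target(context):
    """Return the matched head phrase (tags included), or None if absent."""
    m = TARGET.search(context)
    return m.group(0) if m else None


matches = [find_target(c) for c in contexts]
for context, match in zip(contexts, matches):
    print(repr(match), "<-", context)
```

The first two contexts contain a multi-word head that the expression accepts; the third has a single-word head, which this particular expression does not match.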

We use the term "words" very loosely - what we really mean is a kind of
token or string that can be anything from a few characters to a longer
phrase (this applies to features, target words, etc.). Please do let us
know if you have any trouble with the --target option - this is something
we haven't used very heavily here, but we deliberately included it in the
design for reasons like the one you are describing (so we'll be curious to
see how well it works and whether there are any unexpected glitches)!

I hope this helps!

Good luck,
Ted

On Mon, Nov 24, 2008 at 5:33 PM, Rob Koeling <[EMAIL PROTECTED]> wrote:
>
> Dear Ted and Anagha,
>
> I had a good play with the package. There were some problems with installing
> some of the packages here locally,
> therefore I also made good use of the web interface (and gave your machines
> a bit of a workout). Let me know
> if there are any restrictions on size or preferred times, so I won't slow
> down other processes.
> Though, if all is well, the last problems with the local installation should
> have been solved by now.
>
> For my application, I'd like to cluster utterances based on the surrounding
> utterances. So far, I've experimented
> with 'headless contexts', where the context is the whole snippet of text
> consisting of the utterance itself plus the
> previous and next utterance. However, this is not really precise enough. It
> would be nice if I could create a 'headed
> context', where the 'head' is the utterance in question and the surrounding
> utterances the context.
> However, 'heads' seem to be limited to one word. Has anyone played with the
> idea of allowing, e.g., phrases
> to be the head? Ideally I would like to create features consisting of words
> from the target combining words from
> the context (like the target co-occurrence features, but then with any word
> in the phrase that is regarded as the
> target). This is probably fairly easy to implement (I haven't looked at the
> feature selection code yet).
> I suppose my question is, is there a good reason why targets can only be one
> word, or was that just
> an application driven decision?
>
> Best,
>
>   - Rob
>
>
>



-- 
Ted Pedersen
http://www.d.umn.edu/~tpederse

