Hi Ted,
Thank you for your answer. I've fixed the config-lesk file, and it now
works as desired.
I also tried the vector measure, and used the files in /utils to build a
compound nouns file and a vector db file.
However, it runs awfully slowly. Do you have any idea how to speed it up?
Best,
Yael
-----Original Message-----
From: ted pedersen [mailto:[EMAIL PROTECTED]
Sent: Sunday, January 30, 2005 9:18 PM
To: wn-similarity@yahoogroups.com
Subject: Re: [wn-similarity] lesk measure
Hi Yael,
The lesk measure has a number of options that are controlled via its
configuration file, and if you have your configuration file set
differently, or aren't using one, then the results from the measure will
likely be different.
The most important configuration options for lesk are probably the
relations file (which determines which relations you use in carrying
out the gloss matching) and the stoplist (which removes stop words
from the glosses).
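As a rough illustration of those two options, a lesk configuration file might look something like the sketch below. This is my reading of the WordNet-Similarity configuration format (module name on the first line, then name::value pairs), so please check it against the documentation for your version. Both file paths are placeholders rather than real locations, and the trace line is optional.

```text
WordNet::Similarity::lesk
# Placeholder paths (assumptions): point these at your own copies of the
# relation file and stoplist, e.g. the sample files shipped with the package.
relation::/path/to/lesk-relation.dat
stop::/path/to/stoplist.txt
# Optional: turn tracing on so the matched gloss overlaps can be inspected.
trace::1
```

If your version does not accept the # comment lines, simply drop them.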
In the web interface we use the stoplist as provided with
WordNet-Similarity in the sample files, and we also use the relation file
as provided (which uses all synsets within one "link" of the target
synsets regardless of the type of relations).
So, I suspect you may be using a different set of relations, or maybe not
using a stoplist? One way to get a quick sense of what the differences
between your system and the web interface might be is to look at the
traces. The effect of using (or not) a stoplist or different relations is
usually pretty extreme.
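For what it's worth, here is a minimal sketch of how one might pull the trace out for a single pair of senses. It assumes tracing is switched on in the configuration file passed on the command line, and that bird#n#1 and poultry#n#1 are the senses of interest (they are only examples). The helper name score_with_trace is made up for this illustration; getRelatedness, getError and getTraceString are the measure's own methods.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: score one pair of senses and return the score
# together with the trace text produced for that comparison.
sub score_with_trace {
    my ($measure, $sense1, $sense2) = @_;
    my $score = $measure->getRelatedness($sense1, $sense2);
    my ($error, $message) = $measure->getError();
    die $message if $error > 1;    # level 1 is a warning, level 2 an error
    return ($score, $measure->getTraceString());
}

# Usage: perl lesk_trace.pl /path/to/config-lesk.conf
# (script name and path are placeholders; the config needs trace enabled)
if (@ARGV) {
    require WordNet::QueryData;
    require WordNet::Similarity::lesk;

    my $wn   = WordNet::QueryData->new;
    my $lesk = WordNet::Similarity::lesk->new($wn, $ARGV[0]);
    my ($error, $message) = $lesk->getError();
    die $message if $error > 1;

    # Example senses only; substitute the pair you are comparing.
    my ($score, $trace) = score_with_trace($lesk, 'bird#n#1', 'poultry#n#1');
    print "lesk score: $score\n$trace\n";
}
```

Running this once with your own configuration file and once with the sample relation file and stoplist should make it fairly obvious where the extra overlaps are coming from.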
Let us know if this doesn't explain the differences, or if you'd like more
information about how the configuration options are set in the web
interface!
Cordially,
Ted
On Sun, 30 Jan 2005, Yael Netzer wrote:
> Hi
>
> I am using wordnet-similarity to check relatedness of word pairs.
>
> For this purpose I use the HSO and LESK measures.
>
> For each word, I find all synsets from all possible parts of speech and
> take the maximum score over all possible sense pairs.
>
> However, for the lesk I get much higher scores than I get when I try the
> online version
>
> http://marimba.d.umn.edu/cgi-bin/similarity.cgi
>
> for instance:
>
> bird-poultry lesk: 490
>
> online version: 159
>
> This is my loop:
>
> # try every sense of every part-of-speech form of both words
> foreach my $sensew1 (@wordPos1) {
>   foreach my $senseSensew1 ($wn->querySense($sensew1)) {
>     foreach my $sensew2 (@wordPos2) {
>       foreach my $senseSensew2 ($wn->querySense($sensew2)) {
>         if ($senseSensew2 && $senseSensew1) {
>           my $lesk_relatedness =
>             $lesk_measure->getRelatedness($senseSensew1, $senseSensew2);
>           ($lesk_error, $lesk_errString) = $lesk_measure->getError();
>           die $lesk_errString if ($lesk_error > 1);
>           # keep the highest score seen so far
>           $lesk_rel = max($lesk_relatedness, $lesk_rel);
>         }
>       }
>     }
>   }
> }
>
> Can you explain why there is a difference?
>
> Thanks,
>
> Yael
>
--
Ted Pedersen
http://www.d.umn.edu/~tpederse